BOX PCT

Commissioner for Patents 
Washington, DC 20231 

PRELIMINARY AMENDMENT 

Sir: 

Please amend the above-referenced application as follows prior to 
substantive examination. 

In the Specification.

After the title, please insert the following: 

- Related Application Information 

This application claims the benefit under 35 U.S.C. § 371 from PCT 
Application No. PCT/CA99/00656, filed July 20, 1999, the disclosure of which 
is incorporated by reference herein in its entirety, which claims the benefit of 
Canadian Application Serial No. 2,237,701, filed July 20, 1998 and Canadian 
Application Serial No. 2,253,647, filed December 10, 1998, the disclosures of 
which are incorporated by reference herein in their entirety.- 

In the Claims.

Please amend the claims as follows. 

13. (Amended) The isolated polynucleotide of claim 6 [any one of the
preceding claims] wherein the polynucleotide is a polydeoxyribonucleotide.



14. (Amended) The isolated polynucleotide of claim 6 [any one of claims
1 to 11] wherein the polynucleotide is a polyribonucleotide.

16. (Amended) A recombinant vector comprising the isolated
polynucleotide of claim 6 [any one of claims 1 to 15].



43. (Amended) The transgenic animal of claim 42 [L01] wherein the 
polynucleotide encodes a human SARA protein or a portion thereof. 



Claims 1-44 are pending in this application. Claims 13, 14 and 16 have 
been amended herein to remove multiple dependencies from the claims. 
Claim 43 has been amended to correct a typographical error. It is submitted 
that this application is now in condition for substantive examination, which 
action is respectfully requested. 
SARA PROTEINS



Field of the Invention

The invention relates to a family of proteins, the SARA proteins, which
bind to receptor-regulated Smad proteins and are involved in appropriate
localization of these Smad proteins for receptor activation.



Background of the Invention 

The Transforming Growth Factor-beta (TGFβ) superfamily, whose
members include TGFβs, activins and bone morphogenetic proteins (BMPs),
have wide-ranging effects on cells of diverse origins (Attisano and Wrana, 1998;
Heldin et al., 1997; Kretzschmar and Massague, 1998). Signaling by these
secreted factors is initiated upon interaction with a family of cell-surface
transmembrane serine/threonine kinases, known as type I and type II receptors.
Ligand induces formation of a type I/type II heteromeric complex which permits
the constitutively active type II receptor to phosphorylate, and thereby activate,
the type I receptor (Wrana et al., 1994). This activated type I receptor then
propagates the signal to a family of intracellular signaling mediators known as
Smads (Attisano and Wrana, 1998; Heldin et al., 1997; Kretzschmar and
Massague, 1998).

The first members of the Smad family identified in invertebrates were the
Drosophila MAD and the C. elegans sma genes (sma-2, sma-3 and sma-4; Savage
et al., 1996; Sekelsky et al., 1995). Currently, the family includes additional
invertebrate Smads, as well as nine vertebrate members, Smad1 through 9
(Attisano and Wrana, 1998; Heldin et al., 1997; Kretzschmar and Massague,
1998). Smad proteins contain two conserved amino (MH1) and carboxy (MH2)
terminal regions separated by a more divergent linker region. In general, Smad
proteins can be subdivided into three groups: the receptor-regulated Smads,
which include Smad1, 2, 3, 5 and 8, Mad, sma-2 and sma-3; the common
Smads, Smad4 and Medea; and the antagonistic Smads, which include Smad6, 7
and 9, DAD and daf-3 (Heldin et al., 1997; Nakayama et al., 1998; Patterson et
al., 1997).

Numerous studies with vertebrate Smad proteins have provided insights
into the differential functions of these proteins in mediating signaling.
Receptor-regulated Smads are direct substrates of specific type I receptors and
the proteins are phosphorylated on the last two serines at the carboxy-terminus
within a highly conserved SSXS motif (Abdollah et al., 1997; Kretzschmar et al.,
1997; Liu et al., 1997b; Macias-Silva et al., 1996; Souchelnytskyi et al., 1997).
Interestingly, Smad2 and Smad3 are substrates of TGFβ or activin receptors and
mediate signaling by these ligands (Liu et al., 1997b; Macias-Silva et al., 1996;
Nakao et al., 1997a), whereas Smad1, 5 and 8 appear to be targets of BMP
receptors and thereby propagate BMP signals (Chen et al., 1997b; Hoodless et
al., 1995; Kretzschmar et al., 1997; Nishimura et al., 1998). Once
phosphorylated, these Smads bind to the common Smad, Smad4, which lacks
the carboxy-terminal phosphorylation site and is not a target for receptor
phosphorylation (Lagna et al., 1996; Zhang et al., 1997). Heteromeric
complexes of the receptor-regulated Smad and Smad4 translocate to the nucleus
where they function to regulate the transcriptional activation of specific target
genes. The antagonist Smads, Smad6, 7 and 9, appear to function by blocking
ligand-dependent signaling by preventing access of receptor-regulated Smads to
the type I receptor or possibly by blocking formation of heteromeric complexes
with Smad4 (reviewed in Heldin et al., 1997).

Analysis of the nuclear function of Smads has demonstrated that Smads
can act as transcriptional activators and that some Smads, including Drosophila
Mad, and the vertebrate Smad3 and Smad4, can bind directly to DNA, albeit at
relatively low specificity and affinity (Dennler et al., 1998; Kim et al., 1997;
Labbe et al., 1998; Yingling et al., 1997; Zawel et al., 1998).

Localization of Smads is critical in controlling their activity, and Smad
phosphorylation by the type I receptor regulates Smad activity by inducing
nuclear accumulation (Attisano and Wrana, 1998; Heldin et al., 1997;
Kretzschmar and Massague, 1998). However, little is known about how Smad
localization is controlled prior to phosphorylation and how this might function
in modulating receptor interactions with its Smad substrates.



Summary of the Invention

Smad proteins (Smads) transmit signals from transmembrane ser/thr kinase
receptors to the nucleus. Mammalian and non-mammalian proteins have been
identified which interact directly with Smads and are designated the Smad
Anchor for Receptor Activation or SARA proteins.

The invention provides cDNA sequences encoding this previously
undescribed family of SARA proteins which bind to receptor-regulated Smad
proteins and ensure appropriate localization of these Smad proteins for
activation by a Type I receptor of a TGFβ, activin or BMP signaling pathway.

For example, TGFβ signaling induces dissociation of Smad2 or Smad3
from a SARA protein with concomitant formation of Smad2/Smad4 or
Smad3/Smad4 complexes and nuclear translocation. In the absence of signaling,
SARA functions to recruit a particular Smad (e.g. Smad2 or Smad3) to distinct
subcellular sites in the cell and interacts with the TGFβ superfamily receptor
complex in cooperation with the particular receptor-regulated Smad. Mutations
in hSARA1 that cause mislocalization of Smad2, and interfere with receptor
association, inhibit receptor-dependent transcriptional responses, indicating that
regulation of Smad localization is essential for TGFβ superfamily signaling. The
invention provides a novel component of the signal transduction pathway that
functions to anchor Smads to specific subcellular sites for activation by the Type
I receptor of the TGFβ, activin or BMP signaling pathways.

The SARA proteins are characterised by the presence of three domains: a
double zinc finger or FYVE domain responsible for the subcellular localisation of
the SARA protein or SARA-Smad complex, a Smad-binding domain which
mediates the interaction or binding of one or more species of Smad protein and
a carboxy terminal domain which mediates association with the TGFβ
superfamily receptor. The FYVE domain may bind phosphatidyl
inositol-3-phosphate.

In accordance with one embodiment, the invention provides isolated
polynucleotides comprising nucleotide sequences encoding SARA proteins.

In accordance with a further series of embodiments, the invention
provides an isolated polynucleotide selected from the group consisting of

(a) a nucleotide sequence encoding a human SARA protein;

(b) a nucleotide sequence encoding a mammalian SARA protein;

(c) a nucleotide sequence encoding a non-mammalian SARA
protein;

(d) a nucleotide sequence encoding the human SARA amino acid
sequence of Table 2 (hSARA1: Sequence ID NO:2);

(e) a nucleotide sequence encoding the human SARA amino acid
sequence of Table 4 (hSARA2: Sequence ID NO:4);

(f) a nucleotide sequence encoding the Xenopus SARA amino acid
sequence of Table 6 (XSARA1: Sequence ID NO:6); and

(g) a nucleotide sequence encoding the Xenopus SARA amino acid
sequence of Table 8 (XSARA2: Sequence ID NO:8).

In accordance with a further embodiment, the invention provides the
nucleotide sequences of Table 1 (human SARA1 or hSARA1), Table 3 (human
SARA2 or hSARA2), Table 5 (Xenopus SARA1 or XSARA1) and Table 7 (Xenopus
SARA2 or XSARA2).

In accordance with a further embodiment, the invention provides 
recombinant vectors including the polynucleotides disclosed herein and host 
cells transformed with these vectors. 

The invention further provides a method for producing SARA proteins,
comprising culturing such host cells to permit expression of a SARA
protein-encoding polynucleotide and production of the protein.

The invention also includes polynucleotides which are complementary to
the disclosed nucleotide sequences, polynucleotides which hybridize to these
sequences under high stringency and degeneracy equivalents of these
sequences.
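
By way of illustration only, the notion of a complementary polynucleotide can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `reverse_complement` and the nine-nucleotide toy strand are hypothetical and are not taken from the disclosed sequences, and the sketch covers strict Watson-Crick complementarity only, not hybridization under high stringency or degeneracy equivalents.

```python
# Illustrative sketch only; the toy strand below is invented and is
# NOT one of the disclosed SARA sequences.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand written 5'->3'."""
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return dna.translate(_COMPLEMENT)[::-1]


toy = "ATGGCCTTA"  # hypothetical 9-nt strand
print(reverse_complement(toy))  # TAAGGCCAT
```

Applying the function twice returns the original strand, which is a convenient check that the complement table is symmetric.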

In accordance with a further embodiment, the invention provides
antisense molecules which may be used to prevent expression of a SARA
protein. Such antisense molecules can be synthesised by methods known to
those skilled in the art and include phosphorothioates and similar compounds.

The invention further includes polymorphisms and alternatively spliced
versions of the disclosed SARA genes and proteins wherein nucleotide or amino
acid substitutions or deletions do not substantially affect the functioning of the
gene or its encoded protein.

The invention also enables the identification and isolation of allelic
variants or homologues of the described SARA genes, and their corresponding
proteins, using standard hybridisation screening or PCR techniques.

The invention provides a method for identifying allelic variants or 
homologues of the described SARA genes, comprising 

choosing a nucleic acid probe or primer capable of hybridizing to a SARA
gene sequence under stringent hybridisation conditions;

mixing the probe or primer with a sample of nucleic acids which may 
contain a nucleic acid corresponding to the variant or homologue; and 

detecting hybridisation of the probe or primer to the nucleic acid
corresponding to the variant or homologue.

In accordance with a further embodiment, the invention provides
fragments of the disclosed polynucleotides, such as polynucleotides of at least
10, preferably 15, more preferably 20 consecutive nucleotides of the disclosed
polynucleotide sequences. These fragments are useful as probes and PCR
primers or for encoding fragments, functional domains or antigenic determinants
of SARA proteins.
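
As a purely illustrative aid, the enumeration of fragments of a given number of consecutive nucleotides can be sketched as follows. The helper name `consecutive_fragments` and the 20-nucleotide toy sequence are hypothetical, chosen only to show the window arithmetic; nothing here is drawn from the disclosed sequences.

```python
def consecutive_fragments(seq: str, length: int) -> list[str]:
    """Return every window of `length` consecutive residues, in order."""
    if length < 1:
        raise ValueError("length must be positive")
    # A sequence of n residues contains n - length + 1 such windows
    # (none at all when length exceeds n).
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


toy = "ATGGCCTTAGGCATTCCGAA"  # hypothetical 20-nt sequence, invented
for n in (10, 15, 20):
    print(n, len(consecutive_fragments(toy, n)))
```

For the 20-nucleotide toy sequence this gives 11, 6 and 1 fragments of lengths 10, 15 and 20 respectively.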

In accordance with a further embodiment, the invention provides
substantially purified SARA proteins, including the proteins of Table 2 (hSARA1),
Table 4 (hSARA2), Table 6 (XSARA1) and Table 8 (XSARA2).

In accordance with one embodiment, a SARA protein has a FYVE domain,
a Smad binding domain (SBD) and an amino acid sequence having at least 50%
overall identity with the amino acid sequence of hSARA1 (Sequence ID NO:2).

In accordance with a preferred embodiment, a SARA protein has a FYVE
domain having at least 65% identity of amino acid sequence with the FYVE
domain of hSARA1 and a C-terminal sequence of 550 consecutive amino acids
which have at least 50% identity with the C-terminal 550 amino acid residues of
hSARA1.

In accordance with a more preferred embodiment, a SARA protein has a
FYVE domain having at least 65% identity of amino acid sequence with the
FYVE domain of hSARA1 and wherein the portion of the SBD corresponding to
amino acid residues 721 to 740 of hSARA1 has at least 80% identity with that
portion of hSARA1.
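
The percentage identity thresholds recited above may be easier to follow with a small worked sketch. The following Python is an assumption-laden illustration, not a definition adopted by this specification: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identical residues over all aligned columns. The function name `percent_identity` and the two short toy peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption (not taken from the specification): inputs are pre-aligned,
    equal-length strings, and columns with a gap in one sequence count
    as mismatches. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned positions")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy peptides, invented for illustration.
identity = percent_identity("MKTAYIAK", "MKSAY-AK")
print(identity)          # 75.0
print(identity >= 65.0)  # True
```

Other conventions exist (for example, dividing by the length of the shorter sequence), so any comparison against the 50%, 65% or 80% figures should state which convention is used.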

The invention further provides a method for producing antibodies which
selectively bind to a SARA protein comprising the steps of

administering an immunogenically effective amount of a SARA
immunogen to an animal;

allowing the animal to produce antibodies to the immunogen; and

obtaining the antibodies from the animal or from a cell culture derived
therefrom.

The invention further provides substantially pure antibodies which bind
selectively to an antigenic determinant of a SARA protein. The antibodies of the
invention include polyclonal antibodies, monoclonal antibodies and single chain
antibodies.

The invention includes analogues of the disclosed protein sequences,
having conservative amino acid substitutions therein. The invention also
includes fragments of the disclosed protein sequences, such as peptides of at
least 6, preferably 10, more preferably 20 consecutive amino acids of the
disclosed protein sequences.

The invention further provides polypeptides comprising at least one 
functional domain or at least an antigenic determinant of a SARA protein. 

In accordance with a further embodiment, the invention provides
peptides which comprise SARA protein Smad binding domains and
polynucleotides which encode such peptides.

In accordance with a further embodiment, the invention provides a Smad
binding domain peptide selected from the group consisting of

(a) SASSQSPNPNNPAEYCSTIPPLQQAQASGALSSPPPTVMVPVCV
LKHPGAEVAQPREQRRVWFADGILPNGEVADAAKLTMNGTSS; and

(b) amino acids 589 to 672 of the XSARA1 sequence of Table 9.

The invention includes fragments and variants of these Smad binding 
domain peptides which retain the ability to bind a Smad protein. 

In accordance with a further embodiment, the invention provides
peptides which comprise SARA protein FYVE domains and polynucleotides
which encode such peptides.

In accordance with a further embodiment, the invention provides a FYVE
domain peptide selected from the group consisting of

(a) amino acids 587 to 655 of the hSARA1 sequence of Table 9;

(b) amino acids 510 to 578 of the XSARA1 sequence of Table 9; and

(c) the consensus amino acid sequence of Table 10.

The invention includes fragments and variants of these FYVE domain 
peptides which retain the function of the parent peptide. 
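
Residue ranges such as "amino acids 587 to 655" are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing used in most programming languages. A minimal sketch of the conversion is given below; the helper name `residues` and the ten-residue toy peptide are hypothetical, and the placeholder string standing in for a 1323-residue protein is not the hSARA1 sequence.

```python
def residues(protein: str, start: int, end: int) -> str:
    """Return residues start..end, numbered from 1 and inclusive at both ends."""
    if not (1 <= start <= end <= len(protein)):
        raise ValueError("range must satisfy 1 <= start <= end <= length")
    # Shift the start by one for 0-based indexing; the inclusive end
    # already equals the exclusive Python slice bound.
    return protein[start - 1:end]


toy = "MSTNPKPQRK"  # hypothetical 10-residue peptide, invented
print(residues(toy, 3, 5))  # TNP

# Placeholder of length 1323 used only to check the range arithmetic.
placeholder = "A" * 1323
print(len(residues(placeholder, 587, 655)))  # 69
```

On this arithmetic, the ranges 587 to 655 and 510 to 578 each span 69 residues.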

In accordance with a further embodiment, the invention provides
peptides which comprise SARA protein TGFβ receptor interacting domains and
polynucleotides which encode such peptides.

In accordance with a further embodiment, the invention provides a TGFβ
receptor interacting domain peptide comprising amino acids 751 to 1323 of the
hSARA1 sequence of Table 9.

The invention includes fragments and variants of these TGFβ receptor
binding domain peptides which retain the binding ability of the parent peptide.

The invention further provides methods for modulating signaling by
members of the TGFβ superfamily which signal through pathways which involve
a SARA protein.

Modulation of signaling by a TGFβ superfamily member through such a
pathway may be effected, for example, by increasing or reducing the binding of
the SARA protein involved in the pathway with its binding partner.

In accordance with a further embodiment, TGFβ superfamily signaling,
including TGFβ signaling, by a pathway involving a SARA protein described
herein may be modulated by modulating the binding of the SARA protein to a
Smad binding partner, by modulating the binding of its FYVE domain to its
binding partner or by modulating the binding of the SARA protein to a TGFβ
superfamily receptor, such as the TGFβ receptor.

For example, the binding of a SARA protein to a Smad binding partner
may be inhibited by a deletion mutant of the protein lacking either the SBD
domain or the FYVE domain or by the SARA protein Smad binding domain
peptides or FYVE domain peptides described herein, and effective fragments or
variants thereof. The binding of a SARA protein to a TGFβ superfamily receptor
may be inhibited by a deletion mutant of the protein lacking a C terminal portion
or by the SARA protein TGFβ receptor binding domain peptides described
herein, and effective fragments and variants thereof.

In accordance with a further embodiment, TGFβ superfamily signaling,
including TGFβ signaling, by a pathway involving a SARA protein may be
modulated by modulating the binding of the SARA protein FYVE domain to
phosphatidyl inositol-3-phosphate, by increasing or decreasing the availability of
phosphatidyl inositol-3-phosphate or by administration of agonists or antagonists
of phosphatidyl inositol-3-phosphate kinase.

The invention also provides a method of modulating a TGFβ superfamily
signaling pathway involving phosphatidyl inositol-3-phosphate, including a
TGFβ signaling pathway, by increasing or decreasing the availability of SARA
protein or by modulating the function of SARA protein.

The invention further provides methods for preventing or treating diseases
characterised by an abnormality in a TGFβ superfamily member signaling
pathway which involves a SARA protein, by modulating signaling in the
pathway, as described above.

TGFβ signaling is important in wound healing, and excessive signaling is
associated with scarring, with arthritis and with fibrosis in numerous diseases,
including fibrosis of the liver and kidney. TGFβ signaling is also involved in
modulating inflammatory and immune responses and can contribute to tumour
progression.

The invention thus provides methods for modulating TGFβ-dependent
cell proliferation or fibrogenesis.

The BMP signaling pathways are important in tissue morphogenesis and
in protecting tissues and restoring or regenerating tissues after tissue damage, for
example in bone, kidney, liver and neuronal tissue (see, for example, Reddy,
A.H. (1998), Nature Biotechnology, v. 16, pp. 247-252).

The invention further provides methods for modulating BMP-dependent
phenotypic marker expression by modulating the interactions of SARA proteins
involved in these BMP signaling pathways.

In accordance with a further embodiment, modified versions of a SARA
protein may be provided as dominant-negatives that block TGFβ superfamily
signaling. These modified versions of SARA could, for example, lack the Smad
binding domain and thereby prevent recruitment of Smad or could lack the FYVE
domain and thereby inhibit signaling by interfering with translocation.

These modified versions of SARA may be provided by gene therapy, for
example using transducing viral vectors. Expression may be driven by inclusion
in the vector of a promoter specific for a selected target cell type. Many
examples of such specific promoters are known to those skilled in the art.

In a further embodiment, a normal version of a SARA protein such as
hSARA1 could be provided by gene therapy to restore function in a disease
wherein SARA is mutated or non-functional.

In a further embodiment, the invention provides a pharmaceutical
composition comprising a purified SARA protein as active ingredient.

In accordance with a further embodiment, the invention provides
non-human transgenic animals and methods for the production of non-human
transgenic animals which afford models for further study of the SARA system and
tools for screening of candidate compounds as therapeutics. For example,
knockout animals, such as mice, may be produced with deletion of a SARA gene.
These animals may be examined for phenotypic changes and used to screen
candidate compounds for effectiveness to reverse these changes.

In a further example, transgenic animals may be produced expressing a
dominant negative mutant of a SARA protein, as described above, either
generally or in specific targeted tissues.

The invention provides many targets for the development of small
molecule drugs, including peptides and peptidomimetic drugs, to interfere with
the interaction of the various binding partners described herein and thereby
modulate signaling by members of the TGFβ superfamily, including TGFβ and
BMPs.

The invention further provides methods for screening candidate
compounds to identify those able to modulate signaling by a member of the
TGFβ superfamily through a pathway involving a SARA protein.

For example, the invention provides screening methods for compounds
able to bind to a SARA protein which are therefore candidates for modifying the
activity of the SARA protein. Various suitable screening methods are known to
those in the art, including immobilization of a SARA protein on a substrate and
exposure of the bound SARA protein to candidate compounds, followed by
elution of compounds which have bound to the SARA protein. The methods
used to characterise the binding interactions of the SARA proteins disclosed
herein, as fully described in the examples herein, may also be used to screen for
compounds which are agonists or antagonists of the binding of a SARA protein.

This invention also provides methods of screening for compounds which
modulate TGFβ superfamily signaling by detecting an alteration in the
phosphorylation state of a SARA protein.

In accordance with a further embodiment, the invention provides a
method for reducing or preventing TGFβ, activin or BMP signaling by inhibiting
the activity of SARA. SARA activity may be inhibited by use of an antisense
sequence to the SARA gene or by mutation of the SARA gene.



Summary of the Drawings

Certain embodiments of the invention are described, reference being 
made to the accompanying drawings, wherein: 

Figure 1 (top panel) shows interaction of full length hSARA1 with
bacterially expressed Smads. Full length SARA protein was produced in an in
vitro transcription/translation system in the presence of [35S]methionine and was
incubated with glutathione-sepharose beads coated with bacterially-expressed
GST fusion proteins of the indicated Smads or Smad2 subdomains. Bound
material was resolved by SDS-PAGE and visualized by autoradiography.
Migration of full length hSARA1, and a translation product that initiates from an
internal methionine located upstream of the Smad binding domain (asterisk) are
indicated. The presence of approximately equivalent amounts of GST fusion
proteins was confirmed by SDS-PAGE and Coomassie staining of a protein
aliquot (bottom panel).

Figure 2 shows interaction of hSARA with Smads in mammalian cells.
COS cells were transfected with Flag-tagged hSARA1 (Flag-SARA) either alone or
together with the indicated Myc-tagged Smad constructs. For Smad6, an
alternative version lacking the MH1 domain was used (Topper et al., 1997). Cell
lysates were subjected to an anti-Flag immunoprecipitation and coprecipitating
Smads detected by immunoblotting with anti-Myc antibodies. The migrations of
anti-Flag heavy and light chains (IgG) are marked. To confirm efficient
expression of hSARA1 and the Smads, aliquots of total cell lysates were
immunoblotted with the anti-Flag and anti-Myc antibodies (bottom panel). The
migrations of hSARA1 and the Smads are indicated.

Figures 3-6 show immunoblots of lysates from COS cells transiently
transfected with various combinations of Flag or Myc-tagged hSARA1, wild type
(WT) or mutant (2SA) Myc or Flag-tagged Smad2, Smad4/HA and wild type (WT)
or constitutively active (A) TβRI/HA, cell lysates being subjected to
immunoprecipitation with anti-Flag or anti-Myc antibodies, as indicated.
Confirmation of protein expression was performed by immunoblotting total cell
lysates prepared in parallel for the indicated tagged protein (totals, bottom
panels).

Figure 3: Transfected cells were metabolically labelled with [32P]PO4 and
cell lysates subjected to immunoprecipitation with anti-Flag antibodies for
visualization of hSARA1 phosphorylation (top panel) or with anti-Myc antibodies
for Smad2 phosphorylation (middle panel). Immunoprecipitates were resolved
by SDS-PAGE and visualized by autoradiography. The migrations of hSARA1
and Smad2 are indicated.

Figure 4: Lysates from transiently transfected COS cells were subjected to
immunoprecipitation with anti-Flag antibodies and Smad2 bound to hSARA1
was analyzed by immunoblotting with anti-Myc antibodies (IP: α-flag; blot:
α-Myc).

Figure 5: Lysates from transiently transfected COS cells were subjected to
immunoprecipitation with anti-Flag antibodies and Smad2 bound to hSARA1
was analyzed by immunoblotting with anti-Myc antibodies (IP: α-flag; blot:
α-Myc). Partial dissociation of hSARA1/Smad2 complexes induced by TGFβ
signaling was enhanced by expression of Smad4.

Figure 6: Cell lysates from transiently transfected COS cells were
subjected to immunoprecipitation with anti-Flag antibodies directed towards
Smad2. Immunoprecipitates were then immunoblotted using anti-Myc or
anti-HA antibodies which recognize hSARA1 or Smad4, respectively.
Coprecipitating SARA (α-myc blot) and Smad4 (α-HA blot) are indicated.

Figure 7, panels A to E, shows photomicrographs of Mv1Lu cells
transiently transfected with various combinations of Flag-Smad2, Myc-hSARA1,
and constitutively active TβRI (TβRI*) as indicated (Tx). hSARA was visualized
with the polyclonal Myc A14 antibody and Texas-Red conjugated goat
anti-rabbit IgG (red) and Smad2 was detected with an anti-Flag M2 monoclonal
antibody followed by FITC-conjugated goat anti-mouse IgG (green). The
subcellular localization of the expressed proteins was visualized by
immunofluorescence and confocal microscopy.

Panels A, B, C, Mv1Lu cells singly transfected with hSARA1 (A) or Smad2
(B) are shown. Cotransfection of Smad2 with the constitutively active TβRI
(TβRI*) results in its accumulation in the nucleus (C).

Panel D, Mv1Lu cells were transfected with hSARA1 and Smad2 and the
localization of hSARA1 (red, left photo) and Smad2 (green, centre photo) is
shown. Colocalization of SARA and Smad2 is shown (right photo) and appears
as yellow.

Panel E, Mv1Lu cells were transfected with hSARA1, Smad2 and activated
TβRI (TβRI*) and the localization of hSARA (red, left photo) and Smad2 (green,
centre photo) is shown. Colocalization of SARA and Smad2 is indicated (right
photo). Note the shift to an orangy-red colour in the punctate spots and an
intensification of Smad2 nuclear staining, indicative of dissociation of Smad2
from SARA and nuclear translocation.

Figure 7, panel F, shows photomicrographs of Mv1Lu cells stained with
rabbit, polyclonal anti-SARA antibody (left photo, green), goat, polyclonal
anti-Smad2/3 antibody (centre photo, red) and with both antibodies (right photo,
yellow), showing co-localization of hSARA1 and Smad2.

Figure 8A shows photomicrographs of Mv1Lu cells transfected with either
hSARA1 alone (panel i), TβRII alone (panel ii) or hSARA1 and TβRII together
(panel iii), then treated with TGFβ and the localization of hSARA1 (red) and
TβRII (green) determined by immunofluorescence and confocal microscopy. In
cells coexpressing hSARA1 and TβRII, superimposing the staining revealed
colocalization of the proteins as indicated by yellow staining in panel iii.

Figure 8B shows affinity labelling of COS cells transiently transfected with
various combinations of Flag-hSARA1, Myc-Smad2, wild type (WT) TβRII and
either wild type or kinase-deficient (KR) versions of TβRI. Cells were
affinity-labelled with [125I]TGFβ and lysates immunoprecipitated with anti-Flag
antibodies. Coprecipitating receptor complexes were visualized by SDS-PAGE
and autoradiography. Equivalent receptor expression was confirmed by
visualizing aliquots of total cell lysates (bottom panel).

Figure 9A shows COS cells transiently transfected with wild type TβRII
and kinase-deficient TβRI and various combinations of wild type Flag-hSARA1
(WT), a mutant version lacking the Smad2 binding domain (ΔSBD) and
Myc-Smad2. The amount of receptor bound to SARA was determined by anti-Flag
immunoprecipitation followed by gamma counting. Data are plotted as the
average of three experiments ±S.D. Protein expression was analyzed by
immunoblotting aliquots of total cell lysates and the results from a representative
experiment are shown (bottom panel).

Figure 9B shows COS cells transiently transfected with wild type TβRII
and kinase-deficient TβRI and Flag-tagged wild type (WT) or mutant versions of
hSARA1 with (black bars) or without (open bars) Myc-Smad2. The amount of
receptor bound to hSARA1 was determined by anti-Flag immunoprecipitation
followed by gamma counting. Protein expression was analyzed by
immunoblotting aliquots of total cell lysates (bottom panel).

Figure 10 is a schematic representation of mutant versions of SARA. The
FYVE domain (shaded bar) and the Smad binding domain, SBD (striped bar), are
indicated. COS cells transiently transfected with Flag-hSARA1 and Myc-Smad2
were immunoprecipitated with anti-Flag antibodies followed by immunoblotting
with anti-Myc antibodies. The presence (+) or absence (-) of a hSARA1/Smad2
interaction is indicated (Smad2 interaction). Mutants used for the subsequent
localization study are marked on the left (i-vi).

Figure 11A shows an immunoblot of lysates from COS cells expressing
Flag-tagged Smad2 or Smad3 incubated with GST alone or with GST-hSARA1
(665-750), which corresponds to the SBD; bound proteins were immunoblotted
using anti-Flag antibodies. The presence of Smad2 and Smad3 bound to
GST-hSARA1 (665-750) is indicated.

Figure 11B shows an immunoblot of lysates, from COS cells expressing
Flag-tagged Smad2 together with wild type (WT) or activated (A) type I receptor,
incubated with GST-hSARA1 (665-750) (GST-SBD) and immunoblotted with
anti-Flag antibodies. The expression levels of Smad2, each receptor and
GST-hSARA1 (665-750) were determined by immunoblotting aliquots of total cell
lysates.

Figure 12 shows the subcellular localization of hSARA1 mutants. Mv1Lu
cells were transiently transfected with wild type (panel i) or mutant versions of
Flag-hSARA1 (panels ii-viii, as marked on the left in Figure 10). Proteins were
visualized by immunofluorescence and confocal microscopy using an
anti-Flag M2 monoclonal antibody followed by FITC-conjugated goat anti-mouse
IgG.

Figure 13 shows photomicrographs of Mv1Lu cells transiently transfected
with mutant versions of Myc-hSARA1 and Flag-Smad2 (panel A) or with wild
type Myc-hSARA1, HA-Smad2 and mutant versions of hSARA1 (panel B).
Protein subcellular localization was visualized by immunofluorescence and
confocal microscopy. hSARA1 was visualized with the polyclonal Myc A14
antibody and FITC-conjugated goat anti-rabbit IgG (green), while Smad2 was
detected with monoclonal antibodies followed by Texas Red-conjugated goat
anti-mouse IgG (red). In B, overlaying the images reveals mislocalization of
Smad2 as green speckles of SARA over red, diffuse Smad2 staining (panels ii and
iii), while colocalization of hSARA1 and Smad2 appears as yellow spots (panels i
and iv).

Figure 14 shows luciferase activity of Mv1Lu cells transfected with
3TP-lux alone or together with the indicated amounts of wild type (WT) or mutant
(Δ1-664 or Δ1-704) versions of hSARA1 and incubated in the presence (black
bars) or absence (open bars) of TGFβ. Luciferase activity was normalized to
β-galactosidase activity and is plotted as the mean ± S.D. of triplicates from a
representative experiment.

Figure 15 shows luciferase activity of HepG2 cells transfected with
ARE-Lux alone (v), or ARE-Lux and FAST2 alone or together with the indicated
amounts of wild type (WT) or mutant versions of hSARA1. Transfected cells
were incubated in the presence (black bars) or absence (open bars) of TGFβ and
luciferase activity was normalized to β-galactosidase activity and is plotted as the
mean ± S.D. of triplicates from a representative experiment.
Figure 16 shows a Northern blot of expression of hSARA1 (upper panel)
and Smad2 (lower panel) in the indicated tissues.

Figure 17 shows an immunoblot of a HepG2 lysate immunoprecipitated
(IP) with preimmune serum (PI), anti-hSARA1 polyclonal antibody (SARA)
without or with TGFβ pretreatment (- and +), or N19 anti-Smad2/3 antibody
(S2), followed by immunoblotting with an anti-Smad2 antibody. The migration
position of Smad2 is indicated (Smad2).

Figure 18 shows a diagram of a model of the interaction of a SARA
protein with a receptor regulated Smad, as exemplified by the interaction of
hSARA1.

Detailed Description of the Invention 

This invention provides a family of proteins that play key roles in TGF-β,
activin and bone morphogenetic protein (BMP) signal transduction pathways. In
particular, the proteins of this family interact with specific Smad proteins to
modulate signal transduction. These proteins are therefore designated as "Smad
Anchor for Receptor Activation" or "SARA" proteins. SARA proteins are
characterised by three distinct domains: (1) a double zinc finger or FYVE domain
responsible for the subcellular localization of the SARA protein or SARA-Smad
complex, possibly through its association with PtdIns(3)P, (2) a Smad binding
domain ("SBD") which mediates the interaction or binding of one or more
species of Smad protein with the particular member of the SARA family and (3) a
carboxy terminal domain which mediates interaction of SARA with members of
the TGFβ superfamily of receptors.

FYVE domains have been identified in a number of unrelated signaling
molecules that include FGD1, a putative guanine exchange factor for Rho/Rac
that is mutated in faciogenital dysplasia, the HGF receptor substrate Hrs-1 and its
homolog Hrs-2, EEA1, a protein involved in formation of the early endosome
and the yeast proteins FAB1, VPS27 and VAC1 (reviewed in Wiedemann and
Cockcroft, 1998). Recently, analysis of a number of FYVE domains from yeast
and mammals has revealed that this motif binds phosphatidyl inositol-3-phosphate
(PtdIns(3)P) with high specificity and thus represents a novel signaling
module that can mediate protein interaction with membranes (Burd and Emr,
1998; Gaullier et al., 1998; Patki et al., 1998; Simonsen et al., 1998;
Wiedemann and Cockcroft, 1998). Comparison of the FYVE domains from the
vertebrate proteins with that from SARA revealed extensive conservation of
residues throughout the domain (Table 10). Thus, SARA contains a FYVE
domain that may function to bind PtdIns(3)P, which has been implicated in
intracellular vesicle transport.

For example, deletion of the FYVE domain in hSARA1 causes
mislocalization of Smad2 or Smad3, interferes with TGFβ receptor interaction
and inhibits TGFβ-dependent transcriptional responses.

Thus, the SARA proteins of the invention define a component of TGFβ
superfamily signaling that fulfills an essential role in anchoring receptor
regulated Smads to specific subcellular domains for activation by a TGFβ
superfamily receptor.

Cloned DNA coding sequences and corresponding amino acid sequences 
for representative human and Xenopus SARA protein family members are shown 
in the Tables, as follows: 

Tables 1 and 2 - human SARA1 (hSARA1) cDNA (Sequence ID NO:1)
and amino acid sequence (Sequence ID NO:2) respectively;

Tables 3 and 4 - human SARA2 (hSARA2) cDNA (Sequence ID NO:3)
and amino acid sequence (Sequence ID NO:4) respectively;

Tables 5 and 6 - Xenopus SARA1 (XSARA1) cDNA (Sequence ID NO:5)
and amino acid sequence (Sequence ID NO:6) respectively; and

Tables 7 and 8 - Xenopus SARA2 (XSARA2) cDNA (Sequence ID NO:7)
and amino acid sequence (Sequence ID NO:8) respectively.

Table 9 shows a comparison of the amino acid sequences of XSARA1 and
hSARA1. Identical residues (dark grey) and conservative changes (light grey),
the FYVE domain (solid underline) and the Smad binding domain (dashed
underline) are indicated. The sequences in XSARA1 used to design degenerate
PCR primers for identifying hSARA1 are shown (arrows). The amino-terminal
end of the partial Xenopus cDNA obtained in the expression screen is marked
(asterisk).

The human SARA of Tables 1 and 2, identified as described in Example 2,
regulates the subcellular localization of Smad2 and Smad3 and recruits these
Smads into distinct subcellular domains. This SARA also interacts with TGFβ
receptors and TGFβ signaling induces dissociation of Smad2 or Smad3 from the
SARA protein with concomitant formation of Smad2/Smad4 complexes and
nuclear translocation.

Table 10 shows alignment of the amino acid sequences of the FYVE
domains from hSARA1, XSARA1, KIAA0305, FGD1, Hrs-1, Hrs-2 and EEA1.
Identical residues (dark grey) and conservative changes (light grey) are marked.
A consensus sequence (bottom) was derived from positions in which at least 6
out of 7 residues were conserved or when proteins contained one of only two
alternate residues.
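
The consensus rule stated for Table 10 can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the sequences
are taken to be pre-aligned to equal length, the function name `consensus`,
its `min_conserved` parameter and the toy sequences are hypothetical and are
not the sequences of Table 10, and writing the "two alternate residues" case
as a bracketed pair is an assumed notation.

```python
from collections import Counter


def consensus(aligned, min_conserved=6):
    """Derive a consensus string from equal-length aligned sequences.

    For each column: emit the majority residue if at least `min_conserved`
    sequences share it; emit a bracketed pair such as [KR] if exactly two
    different residues occur; otherwise emit '-'.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        counts = Counter(column)
        residue, n = counts.most_common(1)[0]
        if n >= min_conserved:
            out.append(residue)
        elif len(counts) == 2:
            out.append("[" + "".join(sorted(counts)) + "]")
        else:
            out.append("-")
    return "".join(out)


# Hypothetical toy alignment of seven sequences (not real FYVE domains).
toy_alignment = ["RCGK", "RCGR", "RCAK", "RCSR", "RCTK", "RCGR", "KCGK"]
toy_consensus = consensus(toy_alignment)
print(toy_consensus)
```

With seven input sequences the default threshold of 6 corresponds to the
"6 out of 7" rule in the text; alignment gaps are not treated specially in
this sketch.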

The regulation of the subcellular localization of components of signaling
pathways can be a key determinant in the effective initiation and maintenance of
signaling cascades. Targeting of signal transduction proteins to specific
subcellular regions is highly regulated, often through specific interactions with
scaffolding or anchoring proteins (Faux and Scott, 1996; Pawson and Scott,
1997). Scaffolding proteins have been defined as proteins that bind to multiple
kinases to coordinate the assembly of a cascade, while anchoring proteins are
tethered to specific subcellular regions in the cell and can act to bring together
components of a pathway. Regulating location of signaling components can
thus coordinate the activity of a signaling network, maintain signaling specificity
or facilitate activation of a pathway by localizing kinases together with their
downstream substrates.

As described herein, a recombinantly produced human SARA protein
bound directly and specifically to unphosphorylated Smad2 and Smad3. In
addition, receptor-dependent phosphorylation induced Smad2 to dissociate from
SARA, bind to Smad4 and translocate to the nucleus. Thus, the hSARA1 protein
functions in TGFβ signaling upstream of Smad activation to recruit Smad2 to the
TGFβ receptor by mediating the specific subcellular localization of Smad and by
associating with the TGFβ receptor complex. Furthermore, inducing
mislocalization of Smad2 by expressing a mutant of the hSARA1 protein blocks
TGFβ-dependent transcriptional responses, indicating an essential role for
SARA-mediated localization of Smads in signaling. Together, these results identify the
cloned hSARA1 protein as a novel component of the TGFβ pathway that
functions to anchor Smad2 to specific subcellular sites for activation by the TGFβ
receptor kinase.

In vitro, receptor-regulated Smads are recognized by the receptor kinases
and are phosphorylated on the C-terminal SSXS motif (Abdollah et al., 1997;
Kretzschmar et al., 1997; Macias-Silva et al., 1996; Souchelnytskyi et al., 1997).
This phosphorylation is similar to receptor-dependent phosphorylation in
mammalian cells, suggesting that SARA is not absolutely required for recognition
of Smads by the receptor complex. In intact cells, however, receptor-regulated
Smads are cytosolic proteins that require activation by transmembrane
serine/threonine kinase receptors. Consequently, Smads may require
recruitment by SARA to interact with TGFβ superfamily receptors. Domains in
which SARA is found correspond to regions where TGFβ receptors are also
localised. TGFβ receptors display regionalized localization and hSARA1 recruits
Smad2 to these domains. The identity of these intracellular domains is unclear.
However, they contain receptors and recent evidence has shown that FYVE
finger domains interact with membranes, so it is reasonable to suggest that these
domains represent membrane vesicles. Thus, clustering of the TGFβ receptor, as
previously described by Henis et al. (1994), may function to direct the receptor
to hSARA1 and the Smad2 substrate. This activity may be most critical in vivo,
where ser/thr kinase receptors are often found in low numbers and only a small
proportion need to be activated for biological responses (Dyson and Gurdon,
1998). This may impose
on the pathway a stringent requirement for SARA to anchor Smads in these sites
for receptor interaction.

The colocalization and association of hSARA1 with the TGFβ receptor
defines a role for hSARA1 in recruiting Smad2 to the receptor kinase.
Furthermore, deletion of the FYVE domain interferes with receptor binding,
prevents the correct localization of hSARA1/Smad2 and blocks TGFβ signaling
(see Example 8 below), suggesting that this is an important function in the
pathway. Interestingly, the binding of the hSARA1 protein identified in Example
2 to the receptor was enhanced upon Smad2 expression and, on its own, SARA
may interact inefficiently with the receptor. However, within the hSARA1/Smad
complex, Smad2 might help drive association with the receptor through its
recognition of the catalytic region of the kinase domain. Consistent with this,
cooperation requires a kinase deficient type I receptor which also traps the
Smad2 substrate (Macias-Silva et al., 1996). Thus, Smad2 may bind to the
catalytic pocket of the type I kinase domain while hSARA1, which is not a
substrate of the kinase, may interact with regions outside of the domain.

The human SARA protein identified in Example 2 did not interact with
any of the other Smads tested, indicating that it functions specifically in Smad2
and Smad3 pathways (see Example 3). However, Smad5 localization in 293
cells displayed a remarkably similar pattern to that of this SARA protein
(Nishimura et al., 1998) and similar patterns were observed for endogenous
Smad1 or 5 in the kidney epithelial cell line, IMCD-3. Thus, localization of
BMP-regulated Smads (for example, Smad1, Smad5 and Smad8) may also be
regulated by a specific SARA family member.

The genes for two other SARA family member proteins were also
identified and cloned. One of these, identified in Xenopus and designated
XSARA2 (Tables 7 and 8), is related to XSARA1, while the other one, hSARA2
(Tables 3 and 4), is a human clone, related to the hSARA1 of Tables 1 and 2.
This second human clone has been identified in EST clone KIAA0305. A
comparison of the SBD from hSARA1 with a similar region from the KIAA0305
sequence indicated that the amino terminal half of the region of the SBD was
highly divergent from the amino acid sequence encoded by KIAA0305. This
suggests that the protein encoded by KIAA0305 may mediate binding with other
as yet unidentified proteins, e.g. other Smads. In contrast to the SBD, the FYVE
domain of the KIAA0305 protein is more closely related to the hSARA1 FYVE
domain (70% identity), suggesting that this protein may be an anchor for other
Smad proteins that function either in the TGFβ pathway or in other signaling
cascades, such as the BMP signal transduction pathway.



SARA is not limiting in Smad activation and TGFβ superfamily signaling

It was observed that elevating Smad2 levels can saturate hSARA1 and
yield a diffuse distribution for Smad2. Thus, the level of the hSARA1 protein is a
key determinant in controlling Smad2 localization. As a consequence,
endogenous Smad2 may or may not display a hSARA1-like distribution,
depending on the relative expression of the two proteins. Indeed, in Mv1Lu
cells, endogenous Smad2 displays a punctate pattern with some diffuse staining
in the cytosol. While not meaning to limit the invention to a particular
mechanism, the data are consistent with the view that once signaling has
commenced, Smad2 dissociates from hSARA1, binds to Smad4 and translocates
to the nucleus, freeing hSARA1 to recruit additional Smad2 from the cytosolic
reservoir. This would provide a mechanism to allow quantitative activation of
Smads in the presence of high levels of TGFβ signaling.

By functioning to recruit Smad2 to the TGFβ receptor, hSARA1 is located
in an important regulatory position in the pathway. Thus, control of hSARA1
localization or protein levels, or its interaction with Smad2, could modulate
TGFβ signaling. Further, disruption of normal hSARA1 function could
potentially be involved in loss of TGFβ responsiveness that is a common feature
during tumour progression.

Modular Domains in SARA 

The function of hSARA1 in TGFβ signaling is mediated by three
independent domains, the Smad binding domain (SBD) that mediates specific
interaction with Smad2 and Smad3, the FYVE domain that targets
hSARA1/Smad2 to specific subcellular sites and the carboxy terminus which
mediates association with the TGFβ receptor. The Xenopus and mouse
forkhead-containing DNA binding proteins, FAST1 and FAST2, bind specifically
to Smad2 and Smad3 and, like hSARA1, interact with the MH2 domains (Chen et
al., 1996; Chen et al., 1997a; Labbe et al., 1998; Liu et al., 1997a).
Comparison of the SBD from this SARA with the Smad Interaction Domain (SID)
from these FAST proteins revealed no regions of obvious similarity. However,
since hSARA1 acts upstream and FAST downstream of Smad activation, these
proteins may employ structurally unrelated domains to distinguish unactivated
versus activated forms of Smad2. Thus, the SBD of this SARA protein
preferentially binds unphosphorylated monomeric Smad2 while the SID from
FAST must bind phosphorylated Smad2 in heteromeric complexes with Smad4.
By analogy, the SBD of other SARA family members may bind the
unphosphorylated monomeric species of other Smads that mediate signal
transduction in other pathways (e.g. Smads 1, 5 or 8 in the BMP signal
transduction pathway).

In hSARA1, the FYVE domain functions independently of the SBD, to
mediate the subcellular targeting of the protein. The FYVE-finger motif has now
been identified in at least 30 proteins from diverse species, such as FGD1, Hrs-1
and 2, and EEA1 (Gaullier et al., 1998; Wiedemann and Cockcroft, 1998).
Recent advances have demonstrated that FYVE finger motifs from a variety of
divergent proteins have a conserved function and bind phosphatidyl
inositol-3-phosphate (PtdIns(3)P) with high specificity (Burd and Emr, 1998; Patki
et al., 1998; Gaullier et al., 1998). Through this interaction, the FYVE finger can
mediate protein interactions with phospholipid bilayers. However, PtdIns(3)P is present
ubiquitously on cell membranes and in the case of EEA1, further protein-protein
interactions with Rab5-GTP are required in addition to the FYVE domain to
target the protein to the correct membranes (Simonsen et al., 1998). Given that
PtdIns(3)P binding by FYVE fingers is conserved in yeast and mammals, it is
likely that the FYVE finger of hSARA1 similarly mediates interaction with the
membrane. Furthermore, it is possible that additional protein-protein
interactions may be required to direct hSARA1 to regions that contain the TGFβ
receptors. The carboxy terminus of hSARA1, which is required for efficient
interaction with the TGFβ receptor, may function in this capacity.

Together, these data define discrete domains in SARA that fulfill specific
aspects of SARA function in TGFβ superfamily signaling. Without being limited
to any particular mechanism, a possible model of the interaction of SARA with a
receptor regulated Smad in TGFβ superfamily signaling, as exemplified by
hSARA1 and its interactions with Smad2 in TGFβ signaling, is shown
diagrammatically in Figure 18. The FYVE domain likely functions to direct SARA
to the membrane, perhaps through interactions with PtdIns(3)P. It thus fulfills an
important role in recruiting hSARA1 to specific subcellular domains that have
been shown also to contain the TGFβ receptor. The SBD in turn functions to
bind unactivated Smad2, thus recruiting the receptor substrate to this subcellular
region. Once localized to this region, the C-terminal domain of hSARA1
functions with Smad2 bound to the SBD to promote interaction with the receptor
complex. These three domains thus function cooperatively to recruit Smad2 to
the TGFβ receptor.



Additional Roles for SARA

Controlling the localization of kinases and their substrates may allow not
only for efficient recognition and phosphorylation but may also function to
maintain specificity and suppress crosstalk between signaling pathways. Thus,
by controlling Smad localization, a SARA family member protein could
additionally function to maintain the highly specific regulation of Smad
phosphorylation by ser/thr kinase receptors that is observed in vivo and could
prevent promiscuous phosphorylation by other kinases in the cell. Furthermore,
through its interactions with a particular receptor, a SARA protein might function
to control the activity or turnover of the receptor complex. Alternatively, SARA
may also fulfill scaffolding functions to coordinate the receptor-dependent
activation of Smads with other as yet unidentified components of a signaling
pathway.

Nucleic Acids 

In accordance with one series of embodiments, the present invention
provides isolated nucleic acids corresponding to, or related to, the human and
Xenopus SARA nucleic acid sequences disclosed herein. In addition to the
SARA nucleotide sequences disclosed herein, one of ordinary skill in the art is
now enabled to identify and isolate homologues of the SARA genes described
herein. One of ordinary skill in the art may screen preparations of genomic or
cDNA from other species using probes or PCR primers derived from nucleotide
sequences disclosed herein. In accordance with a further embodiment, the
invention provides isolated nucleic acids of at least 10 consecutive nucleotides,
preferably 15 consecutive nucleotides, more preferably 20 consecutive
nucleotides of Sequence ID NO:1, Sequence ID NO:3, Sequence ID NO:5 and
Sequence ID NO:7, up to the complete sequences. Short stretches of nucleotide
sequence are useful as probes or primers for identification or amplification
of the nucleic acids of the invention or for encoding fragments, functional
domains or antigenic determinants of SARA proteins.

The invention also includes polynucleotides which are complementary to
the disclosed sequences, polynucleotides which hybridise to these sequences at
high stringency and degeneracy equivalents of these sequences.
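
As an illustration of the terms used above, the following Python sketch
enumerates every stretch of k consecutive nucleotides in a sequence and
computes the complementary strand, read 5' to 3'. It is a minimal sketch
only: the function names are hypothetical, the toy sequence is invented for
the example and is not taken from any Sequence ID NO, and only unambiguous
DNA bases are handled.

```python
# Translation table for Watson-Crick complements of unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def consecutive_fragments(seq, k=10):
    """Return every stretch of k consecutive nucleotides in seq."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical 18-nucleotide toy sequence (not a disclosed SARA sequence).
toy_sequence = "ATGGCCAACTGCATGAAG"
fragments_15 = consecutive_fragments(toy_sequence, k=15)
print(len(fragments_15), reverse_complement(toy_sequence))
```

Each returned fragment is a candidate probe or primer in the sense of the
text; selecting among them (melting temperature, specificity and so on) is
outside the scope of this sketch.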

Proteins 

SARA proteins may be produced by culturing a host cell transformed with
a DNA sequence encoding a selected SARA protein. The DNA sequence is
operatively linked to an expression control sequence in a recombinant vector so
that the protein may be expressed.

Host cells which may be transfected with the vectors of the invention
may be selected from the group consisting of E. coli, Pseudomonas, Bacillus
subtilis, or other bacilli, yeasts, fungi, insect cells or mammalian cells including
human cells.

For transformation of a mammalian cell for expression of a SARA protein,
the vector may be delivered to the cells by a suitable vehicle. Such vehicles
include vaccinia virus, adenovirus, retrovirus, Herpes simplex virus and other
vector systems known to those of skill in the art.

A SARA protein may also be recombinantly expressed as a fusion protein.
For example, the SARA cDNA sequence is inserted into a vector which contains
a nucleotide sequence encoding another peptide (e.g. GST, glutathione
S-transferase). The fusion protein is expressed and recovered from prokaryotic
(e.g. bacterial or baculovirus) or eukaryotic cells. The fusion protein can then be
purified by affinity chromatography based upon the fusion vector sequence and
the SARA protein obtained by enzymatic cleavage of the fusion protein.

The protein may also be produced by conventional chemical synthetic
methods, as understood by those skilled in the art.

SARA proteins may also be isolated from cells or tissues, including 
mammalian cells or tissues, in which the protein is normally expressed. 

The protein may be purified by conventional purification methods known
to those in the art, such as chromatography methods, high performance liquid
chromatography methods or precipitation.

For example, anti-SARA antibodies may be used to isolate SARA protein 
which is then purified by standard methods. 

Antibodies 

The provision of the polynucleotide and amino acid sequences of SARA
proteins provides for the production of antibodies which bind selectively to a
SARA protein or to fragments thereof. The term "antibodies" includes
polyclonal antibodies, monoclonal antibodies, single chain antibodies and
fragments thereof such as Fab fragments.

In order to prepare polyclonal antibodies, fusion proteins containing
defined portions or all of a SARA protein can be synthesized in bacteria by
expression of the corresponding DNA sequences, as described above. Fusion
proteins are commonly used as a source of antigen for producing antibodies.
Alternatively, the protein may be isolated and purified from the recombinant
expression culture and used as a source of antigen. Either the entire protein or
fragments thereof can be used as a source of antigen to produce antibodies.

The purified protein is mixed with Freund's adjuvant and injected into
rabbits or other appropriate laboratory animals. Following booster injections at
weekly intervals, the animals are then bled and the serum isolated. The serum
may be used directly or purified by various methods including affinity
chromatography to give polyclonal antibodies.

Alternatively, synthetic peptides can be made corresponding to antigenic 
portions of a SARA protein and these may be used to inoculate the animals. 

In a further embodiment, monoclonal anti-SARA antibodies may be
produced by methods well known in the art. Briefly, the purified protein or
fragment thereof is injected in Freund's adjuvant into mice over a suitable period
of time, spleen cells are harvested and fused with a permanently
growing myeloma partner, and the resultant hybridomas are screened to identify
cells producing the desired antibody. Suitable methods for antibody preparation
may be found in standard texts such as Antibody Engineering, 2nd edition,
Borrebaeck, Ed., Oxford University Press, (1995).



Transgenic animals 

In accordance with a further embodiment, the invention provides for the
production of transgenic non-human animals which afford models for further
study of the SARA family of proteins and also provide tools for the screening of
candidate compounds as therapeutics.

Animal species which are suitable for use include rats, mice, hamsters,
guinea pigs, rabbits, dogs, cats, goats, sheep, pigs and non-human primates.

In accordance with one embodiment, a transgenic animal may be
prepared carrying a heterologous SARA gene by inserting the gene into a germ
line or stem cell using standard techniques of oocyte microinjection, or
transfection or microinjection into embryonic stem cells. The techniques of
generating transgenic animals are now well known and fully described in the
literature. For example, a laboratory manual on the manipulation of the mouse
embryo describes standard laboratory techniques for the production of
transgenic mice (Hogan et al. (1986), Manipulating the Mouse Embryo, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York).

In accordance with a further embodiment, the invention enables the
inactivation or replacement of an endogenous SARA gene in an animal by
homologous recombination. Such techniques are also fully described in the
literature. Such techniques produce "knock-out" animals, with an inactivated
gene, or "knock-in" animals, with a replaced gene.



EXAMPLES 

The examples are described for the purposes of illustration and are not
intended to limit in any way the scope of the invention.

Methods of molecular genetics, protein and peptide biochemistry and 
immunology referred to but not explicitly described in this disclosure and 
examples are reported in the scientific literature and are well known to those 
skilled in the art. 

Example 1 - Methods

Isolation of Xenopus and human SARA 

To prepare a probe for library screening, the MH2 domain of Smad2
(amino acids 241-467) was subcloned into a modified pGEX4T-1 vector
containing the protein kinase A recognition site derived from pGEX2TK
(Pharmacia). This bacterial fusion protein was purified, labelled with [32P]γATP
and used as probe to screen a λZAP II Xenopus dorsal lip library as described
(Chen and Sudol, 1995). A screen of 1 x 10^6 plaques yielded four phage which
represented repeated isolates of the same clone. This partial cDNA contained a
2.1 kb open reading frame and 1 kb of 3' untranslated region (UTR). A full
length clone was obtained by a combination of rescreening of the same dorsal
lip library using a 670 base pair EcoRI/HpaI fragment at the 5' end of this clone
and by 5' RACE (Gibco/BRL) using stage 10 Xenopus RNA.

To obtain a human homolog of Xenopus SARA, cDNA was synthesized
from randomly primed total RNA isolated from HepG2 cells. This cDNA was
subjected to polymerase chain reaction (PCR) using degenerate primers as
described previously (Attisano et al., 1992). The 5' and 3' primers, designed to
encode the zinc-finger motif, correspond to

GC(A/C/GAr/)CC(A/C/CmAA(CAT)TG(C/TATGAA(A/C/G/T)TG(C/T) and 
(A/G)CA(A/G)TA(C/T)TC(A/C/G/T)GC(A/C/G/T)GG(A/G)TT(A/G)TT, respectively.

A 150 base pair PCR product was sequenced and then used as probe for
screening a λZAP human fetal brain cDNA library (Stratagene). Eight positive
plaques were obtained, two of which contained an overlap of approximately 1 kb
and covered the entire open reading frame. The sequence of the 5' UTR was
confirmed by sequencing of an expressed sequence tag database clone (clone ID
260739).
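
The slash notation used for the degenerate primers above lists the bases
allowed at each position. As a rough illustration, the Python sketch below
parses that notation and counts how many distinct oligonucleotides a
degenerate primer represents, using the 3' primer as transcribed from the
text. The function names are hypothetical, and characters outside A, C, G, T
and the bracket syntax are simply ignored by this sketch, so it is not
suitable for validating a garbled primer string.

```python
import re
from itertools import product
from math import prod

# Matches either a bracketed choice such as (A/C/G/T) or a single fixed base.
_TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])*)\)|([ACGT])")


def parse_degenerate(primer):
    """Return one tuple of allowed bases per primer position."""
    positions = []
    for choices, base in _TOKEN.findall(primer):
        positions.append(tuple(choices.split("/")) if choices else (base,))
    return positions


def degeneracy(primer):
    """Number of distinct sequences encoded by the degenerate primer."""
    return prod(len(p) for p in parse_degenerate(primer))


def expand(primer):
    """List every concrete sequence encoded by the degenerate primer."""
    return ["".join(bases) for bases in product(*parse_degenerate(primer))]


# 3' primer as written in the text above.
THREE_PRIME = "(A/G)CA(A/G)TA(C/T)TC(A/C/G/T)GC(A/C/G/T)GG(A/G)TT(A/G)TT"
primer_length = len(parse_degenerate(THREE_PRIME))
primer_degeneracy = degeneracy(THREE_PRIME)
print(primer_length, primer_degeneracy)
```

Counting the variants before calling `expand` is a deliberate choice: the
number of concrete sequences grows multiplicatively with each degenerate
position, so it is worth checking the size first.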

Construction of Plasmids

For mammalian expression constructs of SARA, the open reading frame of
hSARA was amplified by PCR and was subcloned into pCMV5 in frame with an
amino-terminal Flag or Myc tag (Hoodless et al., 1996). The deletion mutants of
pCMV5-Flag-hSARA Δ893-1323, Δ346-132, Δ893-1323, and Δ346-1323 were
constructed by deletion of EcoRV-HindIII, XbaI-HindIII, SalI-EcoRV, and SalI-XbaI
fragments, respectively. pCMV5-Flag-hSARA Δ1-594 and Δ1-686 were obtained
by partial digestion with Asp718/SalI and for pCMV5-Flag-hSARA Δ665-1323 an
Asp718/HindIII partial digest was used. pCMV5-Flag-hSARA Δ596-704 was
constructed by deleting an Asp718 fragment. The other hSARA mutants were
constructed by PCR using appropriate primers. pCMV5B-Myc-Smad3 and
Myc-Smad6, pGEX4T-1-Smad2/MH1 (amino acids 1-181), pGEX4T-1-Smad2/linker
(amino acids 186-273), pGEX4T-1-Smad2/MH2 (amino acids 241-467) and
pGEX4T-1-hSARA (amino acids 665-750) were constructed by PCR.

In Vitro Protein Interactions

In vitro transcription/translation reactions were performed using the TNT
coupled reticulocyte lysate system (Promega) following the manufacturer's
instructions using T3 RNA polymerase. Translation was carried out in the
presence of [35S]-methionine and labelled proteins were incubated with purified
GST fusion proteins in TNTE buffer with 10% glycerol for 2 hours at 4°C and
then washed five times with the same buffer. Bound protein was separated by
SDS-PAGE and visualized by autoradiography.

Immunoprecipitation and Immunoblotting

COS-1 cells transfected with LipofectAMINE (GIBCO BRL) were lysed
with lysis buffer (Wrana et al., 1994) and subjected to immunoprecipitation with
either anti-Flag M2 (IBI, Eastman Kodak) or anti-Myc (9E10) monoclonal antibody
followed by adsorption to protein-G sepharose. Precipitates were separated by
SDS-PAGE, transferred to nitrocellulose membranes and immunoblotted as
described previously (Hoodless et al., 1996).

Affinity-Labelling 

LipofectAMINE transfected COS-1 cells were incubated with 200 pM
[125I]TGFβ in media containing 0.2% bovine fetal serum at 37°C for 30 minutes
and receptors were cross-linked to ligand with DSS as described previously
(Macias-Silva et al., 1996). Cell lysates were immunoprecipitated with anti-Flag
antibody and receptors visualized by SDS-PAGE and autoradiography. In some
cases, cross-linked [125I]TGFβ was determined by gamma counting.

Subcellular Localization by Immunofluorescent Confocal Microscopy

Mv1Lu cells, plated on gelatin-coated Permanox chamber slides (Nunc),
were transfected by the calcium phosphate-DNA precipitation method. Fixation,
permeabilisation and reaction with the primary and secondary antibodies were
as described previously (Hoodless et al., 1996). Monoclonal anti-Flag antibodies
were visualized by FITC-conjugated goat anti-mouse IgG (Jackson Laboratories)
and polyclonal Myc antibody (A14, Santa Cruz) was visualized with
Texas-Red-conjugated goat anti-rabbit IgG (Jackson Laboratories). Immunofluorescence was
analyzed on a Leica confocal microscope.

Transcriptional Response Assay 

Mv1Lu cells were transiently transfected with the reporter plasmid p3TP-lux 
(Wrana et al., 1992), CMV-βgal and selected constructs using calcium 
phosphate transfection. Twenty-four hours after transfection, cells were 
incubated overnight with or without 50 pM TGFβ. Luciferase activity was 
measured using the luciferase assay system (Promega) in a Berthold Lumat LB 
9501 luminometer and was normalized to β-galactosidase activity. 
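
The normalization described above can be sketched as follows. This is an 
illustration only, assuming each sample yields one luciferase reading and one 
β-galactosidase reading in arbitrary units; the function names are hypothetical 
and no values from this study are used. 

```python
def normalize_luciferase(luciferase, beta_gal):
    """Return luciferase activity per unit of beta-galactosidase activity.

    Dividing by the co-transfected beta-gal signal corrects for differences
    in transfection efficiency between samples.
    """
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase / beta_gal


def fold_induction(treated, untreated):
    """Ratio of normalized reporter activity with ligand to that without.

    Each argument is a (luciferase, beta_gal) pair of raw readings.
    """
    return normalize_luciferase(*treated) / normalize_luciferase(*untreated)
```

Under these assumptions, fold induction compares ligand-treated and untreated 
samples only after each has been corrected for its own β-galactosidase signal. 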

Example 2 - Identification of SARA family members 

The MH2 domain of Smad2 was fused to glutathione-S-transferase (GST) 
that included a kinase recognition site for protein kinase A (PKA). The 
bacterially-expressed fusion protein was labelled to high specific activity using 
PKA (Chen and Sudol, 1995), and then used to screen a λZAPII expression 
library prepared from the dorsal blastopore lip of Xenopus. From this screen, 
four clones were identified, all of which represented a repeated isolate of a 
partial cDNA clone with no similarity to sequences in the GenBank database. To 
confirm that the product encoded by this clone interacted with Smad2, an in 
vitro transcription/translation system was used to produce 
[35S]methionine-labelled protein. Translation of the cDNA yielded a protein 
product of approximately 80 kDa which corresponded in size to the longest open 
reading frame (ORF) identified in the sequence. Incubation of this product with 
bacterially-produced GST-Smad2(MH2) resulted in efficient binding of the 
translated product to the fusion protein (data not shown). Interaction with full 
length Smad2 was also observed, whereas binding to bacterially-expressed 
Smad1 or Smad4 was not. 

To isolate a full length cDNA, the partial clone identified in the 
interaction screen was used as a probe to rescreen the same blastopore lip 
library. Since the resulting clones lacked the 5' end, 5' RACE was conducted to 
obtain the entire coding sequence. Analysis of the complete cDNA sequence 
(Table 5) revealed a long open reading frame that was contiguous with that of 
the partial clone. The predicted protein, XSARA1, is 1235 amino acids long with 
an estimated molecular mass of 135 kDa (Table 6). Analysis of the full length 
cDNA sequence (Table 9) revealed a region in the middle portion of the 
predicted protein that had similarity to a double zinc finger domain (recently 
renamed the FYVE domain; Mu et al., 1995). The FYVE domain has been 
identified in a number of unrelated signaling molecules that include FGD1, a 
putative guanine exchange factor for Rho/Rac that is mutated in faciogenital 
dysplasia (Pasteris et al., 1994), the HGF receptor substrate Hrs-1 and its 
homolog Hrs-2 (Bean et al., 1997; Komada and Kitamura, 1995), EEA1, a protein 
involved in formation of the early endosome (Mu et al., 1995) and the yeast 
proteins FAB1, VPS27 and VAC1 (Piper et al., 1995; Weisman and Wickner, 
1992; Yamamoto et al., 1995). Comparison of the FYVE domains from the 
vertebrate proteins with that from SARA revealed extensive conservation of 
residues throughout the domain (Table 10). Thus, SARA contains a FYVE 
domain that may fulfill important functions in diverse proteins. 
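
The estimated molecular mass quoted above for XSARA1 is in line with the common 
rule of thumb of roughly 110 Da per amino acid residue. The sketch below applies 
that approximation. The 110 Da average is an assumption introduced here for 
illustration, not a value taken from this specification, and a precise mass 
would require the actual residue composition of the protein. 

```python
# Assumed average residue mass in daltons; real proteins vary with composition.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Rough molecular mass in kDa from the number of amino acid residues."""
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    return n_residues * residue_mass_da / 1000.0


# A 1235-residue protein gives about 136 kDa under this approximation.
xsara1_estimate = estimate_mass_kda(1235)
```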

To investigate the role of SARA in TGFβ superfamily signaling in 
mammalian cells, a human homologue was identified. Using a carboxy-terminal 
portion of XSARA1, a human library was screened and a protein was identified 
that was distantly related to Xenopus SARA (34% identity) and which was also 
sequenced as an EST (KIAA0305). However, no homologs closer to XSARA 
were identified. Thus, degenerate oligonucleotide primers were designed 
encoding amino acids in XSARA1 (Table 9) and HepG2 RNA was used as 
template for degenerate PCR. A related sequence was identified and this partial 
cDNA was used to screen a human brain cDNA library. Four overlapping 
clones encoding a long open reading frame were identified and a search of the 
EST database with this sequence led to the identification of additional 
overlapping cDNA clones from libraries derived from T cells, uterus, endothelial 
cells and melanocytes. Analysis of the contiguous sequence revealed a long 
open reading frame that had a consensus start codon preceded by stop codons in 
all three reading frames (Table 1). Comparison of the predicted protein, hSARA1 
(Table 2), from this cDNA with XSARA1 (Table 9) revealed an overall identity of 
62%, with a divergent 558 residue amino terminal domain (35% identity) 
followed by a closely related carboxy terminus (85% identity). 
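
Percent identity figures such as those above are typically computed over the 
columns of a pairwise alignment. The following is a minimal sketch, assuming two 
already-aligned sequences of equal length in which "-" marks a gap and identity 
is counted only over columns where neither sequence has a gap. The alignment 
method and the exact convention used to obtain the figures in this specification 
are not stated, so this is one possible convention rather than the one used. 

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Applying the same function to sub-ranges of the alignment would give 
region-specific values, in the way separate figures are reported above for the 
amino-terminal domain and the carboxy terminus. 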

Example 3 - hSARA interacts specifically with Smad2 and Smad3 

To characterize the interaction of hSARA with Smads, the full length 
protein was translated in vitro and tested for binding to bacterially-expressed 
Smad fusion proteins. Similar to the Xenopus clone, hSARA1 bound specifically 
to full length Smad2, but not Smad1 or Smad4 (Figure 1). In addition, full length 
Smad3, which is highly related to Smad2, also interacted with hSARA1. To 
define the domains of Smad2 that bound hSARA, various fragments of 
Smad2 corresponding to the MH1 domain, linker region and MH2 domain were 
expressed in bacteria. Similar to the Xenopus clone, hSARA interacted efficiently 
with fusion proteins that comprised the MH2 domain, while no association was 
detected between hSARA and either the MH1 or non-conserved linker domains 
(Figure 1). Thus, hSARA1 specifically interacts with Smad2 through the MH2 
domain. 

To confirm that hSARA also bound to Smads in mammalian cells, a Flag 
epitope tag was introduced at the amino terminus of the protein to create 
Flag-SARA. Transient expression of Flag-SARA in COS cells yielded a protein of the 
predicted molecular weight for SARA (Figure 2) that was not present in 
untransfected cells (data not shown). To investigate the interaction of SARA with 
Smads, Flag-SARA was expressed in COS cells together with Myc-tagged 
versions of Smads 1, 2, 3, 4, 6 and 7. Cell lysates were subjected to anti-Flag 
immunoprecipitation followed by immunoblotting with anti-Myc antibodies. In 
immunoprecipitates from cells expressing either Smad2 or Smad3, efficient 
coprecipitation of either Smad with Flag-hSARA1 was observed (Figure 2). In 
contrast, none of the other Smads coprecipitated with hSARA1. Specific binding 
of this SARA family member to both Smad2 and Smad3 is consistent with the 
observation that these two proteins possess very closely related MH2 domains 
(97% identity) and are both activated by TGFβ or activin type I receptors (Liu et 
al., 1997b; Macias-Silva et al., 1996; Nakao et al., 1997a). Together, these 
results demonstrate that this SARA family member is a specific partner for 
receptor-regulated Smads of the TGFβ/activin signaling pathway. 

Example 4 - Phosphorylation of Smad2 induces dissociation from SARA 

Previous findings have shown that activation of TGFβ signaling results in 
phosphorylation of Smad2 or Smad3 by type I receptors on C-terminal serine 
residues (Liu et al., 1997b; Macias-Silva et al., 1996). A constitutively active 
TGFβ type I receptor was prepared by substituting a threonine in the GS domain 
with an aspartate residue (Wieser et al., 1995). This activated type I receptor 
induces TGFβ signaling in the absence of type II receptors and ligand and 
regulates the phosphorylation and activation of Smad proteins in a manner 
similar to ligand (Macias-Silva et al., 1996; Wieser et al., 1995). COS cells were 
transfected with combinations of Smad2, hSARA1 or both in the presence or 
absence of activated TβRI. Cells were then metabolically labelled with 
[32P]phosphate and phosphorylation of either hSARA1 or Smad2 was assessed in 
immunoprecipitates. Analysis of SARA phosphorylation revealed that the protein 
was basally phosphorylated and the coexpression of the activated type I receptor 
did not appreciably affect the overall phosphorylation (Figure 3). In contrast, 
analysis of Smad2 immunoprecipitated from total cell lysates showed that the 
activated type I receptor induced strong phosphorylation of the protein as 
described previously (Macias-Silva et al., 1996). These results suggest that SARA 
is not phosphorylated in response to TGFβ signaling. 

The phosphorylation state of Smad2 that coprecipitated with hSARA1 was 
examined. Interestingly, unlike the strong induction of Smad2 phosphorylation 
in the total cellular pool, phosphorylation of Smad2 associated with hSARA1 was 
not enhanced, but rather appeared to decrease in the presence of TGFβ signaling 
(Figure 3). This suggested that receptor-dependent phosphorylation of Smad2 
might induce dissociation from hSARA1. To examine this directly, the 
interaction of hSARA1 with wild type Smad2 or a mutant version lacking the 
C-terminal phosphorylation sites (Smad2(2SA)) was analysed. In the absence of 
TGFβ signaling, association of hSARA1 with either Smad2 or Smad2(2SA) was 
comparable (Figure 4). In contrast, in cells coexpressing the activated receptor, a 
significant decrease in the interaction of wild type Smad2 with hSARA1 was 
observed. However, hSARA1/Smad2(2SA) complexes were not reduced by the 
activated receptor. Together, these results suggest that hSARA1 is not 
phosphorylated in response to TGFβ signaling and that it preferentially interacts 
with the unphosphorylated form of Smad2. 

Example 5 - SARA and Smad4 form mutually exclusive complexes with Smad2 

Phosphorylation of Smad2 induces its interaction with Smad4 (Lagna et 
al., 1996; Zhang et al., 1997). hSARA1/Smad2 complexes in COS cells 
coexpressing Smad4 were assessed. In unstimulated cells, the level of 
hSARA1/Smad2 complex formation was comparable either in the presence or 
absence of Smad4 (Figure 5, lanes 3 and 6). However, upon activation of TGFβ 
signaling, dissociation of Smad2 from hSARA1 was significantly enhanced by 
coexpression of Smad4 (Figure 5, lanes 4 and 7). These results indicated that 
phosphorylated Smad2 might preferentially interact with Smad4 rather than 
hSARA1 and suggested that Smad2 might form mutually exclusive complexes 
with either Smad4 or hSARA1. The formation of Smad2/Smad4 and 
Smad2/hSARA1 complexes in the same transfectants was then examined. Cell 
lysates were subjected to immunoprecipitation with anti-Flag antibodies directed 
towards tagged Smad2 and then immunoblotted for the presence of Smad4 and 
hSARA1. Consistent with previous findings (Lagna et al., 1996; Zhang et al., 
1997), interaction of Smad4 with Smad2 was strongly stimulated by the activated 
type I receptor (Figure 6, lanes 3 and 4). Concomitant with the formation of 
Smad2/Smad4 complexes, the interaction of Smad2 with hSARA1 was disrupted 
by activation of signaling (Figure 6, lanes 6 and 7). Thus, complexes of 
Smad2/hSARA1 and Smad2/Smad4 are mutually exclusive, supporting the notion 
that Smad4 may compete for Smad2 to enhance dissociation of hSARA1/Smad2 
complexes. Together these results demonstrate that during TGFβ signaling, 
hSARA1/Smad2 complexes are transient and phosphorylation of Smad2 induces 
dissociation and formation of heteromeric complexes with Smad4. 

Example 6 - hSARA1 regulates the subcellular localization of Smad2 

The studies described above suggest that SARA functions upstream in the 
pathway and might control the subcellular localization of Smad2. To test this, 
confocal microscopy was used to determine whether coexpression of hSARA1 
might alter the localization of Smad2 in the TGFβ-responsive epithelial cell 
line, Mv1Lu. Mv1Lu cells were used rather than COS cells 
since the Myc antibodies crossreacted with endogenous proteins in the COS cells 
and obscured nuclear staining of tagged proteins. In cells expressing hSARA1 alone, 
the protein displayed a punctate staining pattern that was present throughout the 
cytosolic compartment and was excluded from the nucleus (Figure 7A). This 
localization of hSARA1 was in contrast to the diffuse staining typically observed 
for Smad2 in cells overexpressing the protein (Figure 7B). Cells transiently 
transfected with both hSARA1 and Smad2 were examined. In these cells, the 
distribution of hSARA1 was indistinguishable from cells transfected with hSARA1 
alone (Figure 7D, left photo). In contrast, the localization of Smad2 in the 
presence of hSARA1 displayed a dramatic shift to a punctate pattern (compare 
Figure 7B to 7D, centre photos). Moreover, analysis of these immunofluorescent 
staining patterns by confocal microscopy revealed that hSARA1 and Smad2 
precisely colocalized in the cytosol (yellow stain, Figure 7D, right photo). 
Interestingly, expression of Smad2 at much higher levels than hSARA1 reverted 
the distribution of Smad2 to that observed in cells transfected with Smad2 alone 
(data not shown). This supports the notion that elevating the amount of Smad2 
can saturate hSARA1 and yield a diffuse distribution of Smad2 throughout the 
cell. 

Studies were conducted to determine whether activation of TGFβ 
signaling induces nuclear translocation of Smad2 in the presence of hSARA1. As 
shown in Figure 7, the localization of hSARA1 in the cytosolic compartment 
looked similar in the presence or absence of the constitutively active TGFβ type I 
receptor (compare Figure 7D and E, left photos). However, TGFβ signaling 
caused a significant proportion of Smad2 to translocate to the nucleus (Figure 7E, 
centre photo) and this correlated with a shift to an orangy-red colour in the 
cytosolic colocalization stain (Figure 7E, right photo). Thus activation of TGFβ 
signaling induces Smad2 to dissociate from hSARA1 and translocate to the 
nucleus. 

To confirm that the punctate localization of overexpressed SARA reflected 
that of the endogenous protein, the localization of endogenous SARA and Smad2 
was examined in Mv1Lu cells. Analysis of the distribution of endogenous 
hSARA1 using affinity-purified rabbit anti-hSARA1 antibodies revealed a 
punctate distribution that was similar to the pattern observed for transiently 
transfected, epitope-tagged hSARA1 (Figure 7F, left photo). This staining was 
specific, since cells stained with preimmune antisera, or purified antibody 
blocked with the hSARA1 antigen, revealed no detectable staining in the cytosol, 
although some weak background staining was observed in the nucleus (data not 
shown). Examination of endogenous Smad2 distribution in the same cell using 
goat anti-Smad2 antibodies revealed a punctate distribution for Smad2 (Figure 
7F, centre photo) as published previously (Janknecht et al., 1998). Furthermore, 
analysis of hSARA1 and Smad2 together revealed extensive colocalization of the 
two proteins (Figure 7F, right photo). Colocalization was not complete and may 
reflect differences in the stoichiometry of hSARA1 versus Smad2 protein levels as 
suggested above, or the presence of additional regulatory mechanisms in the cell 
that control interaction of the endogenous proteins. 

Taken together with the biochemical analysis, these results indicate that 
hSARA1 functions to anchor or recruit Smad2 to specific subcellular regions 
prior to activation by TGFβ signaling. 

Example 7 - hSARA1 co-localises with TβRII 

The positioning of hSARA1 upstream of Smad2 activation suggested to us 
that hSARA1 might recruit Smad2 to specific subcellular domains for 
phosphorylation and activation by the receptor. Interestingly, previous studies 
on the TGFβ receptor demonstrated clustering of the receptor complex into 
punctate domains that resembled those displayed by hSARA1 (Henis et al., 
1994). To test whether hSARA1 might colocalize with TGFβ receptors, the 
subcellular localization of hSARA1 and TGFβ receptors was investigated 
in Mv1Lu cells. Endogenous TGFβ receptors could not be detected, likely due 
to the low numbers of TGFβ receptors present on these cells and the even smaller 
number that are activated in the presence of ligand. The localization of hSARA1 
in Mv1Lu cells cotransfected with TβRII and treated with TGFβ was therefore 
examined. In the absence of hSARA1, TβRII displayed a punctate staining 
pattern similar to the hSARA1 pattern (Figure 8A, panels i and ii, respectively), as 
observed previously in COS cells. Furthermore, in cells coexpressing hSARA1 
and TGFβ receptors, extensive colocalization of hSARA1 and TβRII was 
observed (Figure 8A, panel iii). This colocalization was not complete, which may 
be due to a restricted distribution of hSARA1 in only a subset of the intracellular 
compartments normally occupied by transmembrane receptors, which include 
the endoplasmic reticulum, Golgi and endocytic pathways. Thus, hSARA1 and 
the TGFβ receptors colocalize to common subcellular domains. 

The colocalization of hSARA1 and the TGFβ receptors suggested the 
possibility that hSARA1 may interact with the TGFβ receptor. To test this, a 
strategy was utilised similar to that employed to characterize the interaction of 
Smad2 with the TGFβ receptor (Macias-Silva et al., 1996). Briefly, COS cells 
were cotransfected with TGFβ receptors in the presence of hSARA1 and were 
affinity-labelled using [125I]TGFβ. hSARA1 was then immunoprecipitated from 
the cell lysates and coprecipitating receptor complexes were resolved by 
SDS-PAGE and visualized by autoradiography or were quantitated using a gamma 
counter. Analysis of cells expressing wild type type II and type I receptors 
revealed that receptor complexes coprecipitated with hSARA1 (Figure 8B, lane 
3). Furthermore, in the presence of kinase deficient type I receptor, there was a 
small increase in binding of hSARA1 to the receptor (Figure 8B, lane 2). This is 
in contrast to Smad2, which only interacts with TGFβ receptor complexes that 
contain kinase deficient type I receptors (Macias-Silva et al., 1996). These data 
suggest that hSARA1 associates with the TGFβ receptor. 

Next examined was whether coexpression of Smad2 might enhance the 
interaction of hSARA1 with TGFβ receptors. In cells expressing wild type 
receptor I, no difference in the amount of receptor complexes that coprecipitated 
with hSARA1, either in the presence or absence of Smad2, was observed (Figure 
8B, compare lanes 3 and 5). In contrast, the association of hSARA1 with 
receptor complexes containing kinase-deficient type I receptors was enhanced 
by Smad2 (Figure 8B, lane 4). This finding was consistent with the previous 
demonstration that kinase-deficient type I receptors stabilize interactions of 
Smad2 with the receptors. To investigate further the requirement for Smad2 in 
the interaction of hSARA1 with the receptor, a mutant of hSARA1, SARA(ΔSBD), 
that removes the Smad binding domain, was tested. Analysis of wild type 
hSARA1 interaction with receptor complexes containing kinase-deficient TβRI 
showed that wild type hSARA1 interacted with the receptor and this was 
enhanced approximately two-fold by Smad2 (Figure 9A). The ΔSBD mutant of 
hSARA1 retained the capacity to associate with the receptor, although the 
efficiency of interaction was slightly reduced relative to wild type hSARA1. 
Importantly, unlike wild type hSARA1, binding of mutant hSARA1 to the 
receptor was not enhanced by coexpression of Smad2. Together, these data 
suggest that hSARA1 interacts with the TGFβ receptor independently of Smad2 
binding and that Smad2 cooperates to enhance the association. 

To further characterize the domains in SARA that mediate binding to the 
TGFβ receptor, the interaction of a panel of SARA mutants with the TGFβ 
receptor was tested. Interestingly, interaction with the TGFβ receptor was 
strongly suppressed in three mutants in which the FYVE domain was disrupted 
(Figure 9B; Δ594, Δ664 and the internal deletion Δ597-665). Since the FYVE 
domain is required for the correct subcellular localization of SARA, it was 
postulated that, once bound to the membrane, other regions in SARA might 
contribute to the interaction with the receptor. To examine this possibility, 
several carboxy-terminal truncation mutants of hSARA1 were tested. 
Interestingly, deletion of the C-terminus downstream of position 750 suppressed 
receptor interaction, despite efficient expression of the truncated protein. This 
suggests that regions in the carboxy-terminus of SARA contribute to receptor 
interaction. In these analyses, the question of whether overexpression of Smad2 
could rescue some interaction of SARA mutants with the receptor was also 
explored. For both the FYVE domain mutants and the C-terminal truncation, 
Smad2 expression was able to restore some interaction with the TGFβ receptor. 
It is likely that the high levels of protein and receptor expression that are 
achieved in COS cells can drive some receptor interaction, even in the absence 
of appropriate localization signals. 

Example 8 - A modular domain in SARA mediates association with Smads 

To investigate the functional importance of SARA in TGFβ signaling, the 
domains in the protein that mediate both its localization to specific subcellular 
regions and its interaction with Smad2 were defined. To this end, a series of 
deletion mutants of hSARA1 were constructed and tested for their ability to 
interact with Smad2 in COS cells by immunoprecipitation followed by 
immunoblotting. As summarized in Figure 10, loss of the first 665 amino acids 
of hSARA1, which included the double zinc finger/FYVE domain, did not 
interfere with hSARA1 binding to Smad2. However, further deletions (Δ1-704) 
completely abolished the interaction of Smad2 with hSARA1. To map the 
carboxy-terminal boundary of the Smad binding domain, a number of C-terminal 
truncations were also analyzed. Deletion of all residues downstream of position 
750 did not affect Smad2 interaction with hSARA1, while an additional loss of 
85 amino acids (Δ665-1323) completely abrogated binding to Smad2. To 
determine whether the region defined by this deletional analysis was sufficient to 
bind Smad2, the 85 amino acids referred to as the Smad Binding Domain (SBD) 
were linked to GST and the fusion protein was expressed in bacteria 
(GST-hSARA(665-750)). Incubation of lysates prepared from cells expressing Smad2 or 
Smad3 with GST-SBD resulted in efficient binding of both Smads to the fusion 
protein (Figure 11A). This interaction is likely direct, since bacterially expressed 
SBD associates efficiently with bacterially-produced Smad2 (data not shown). 
These studies thus define a novel domain in SARA that mediates interaction with 
Smad2 and Smad3 and which is located downstream of the FYVE domain. 
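
The logic of the deletion mapping above can be sketched as an interval 
intersection: the minimal binding region is the stretch of residues retained by 
every construct that still binds Smad2. This is an illustration only. The full 
length of 1323 residues is inferred from the deletion names, intervals are 
treated as inclusive, and "loss of the first 665 amino acids" is read as 
retaining residues 666-1323; these conventions are assumptions, and the text 
itself names the domain as residues 665-750. 

```python
def intersect(intervals):
    """Return the inclusive (start, end) shared by all intervals, or None."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start <= end else None


def span_length(interval):
    """Number of residues in an inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


# Residues retained by the two constructs reported above to still bind Smad2
# (assumed coordinates; see the conventions stated in the lead-in).
binding_constructs = [(666, 1323), (1, 750)]
minimal_region = intersect(binding_constructs)
```

Under these assumptions the shared region spans 85 residues, which agrees with 
the 85 amino acid Smad Binding Domain described above. 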

The above-described analysis in COS cells showed that phosphorylation 
of Smad2 by the TGFβ receptor induced dissociation from SARA. To determine 
whether this reflects an alteration in the ability of the SBD to bind 
phosphorylated Smad2, the interaction of GST-SBD with Smad2 in lysates 
obtained from cells expressing Smad2 alone, or Smad2 together with either wild 
type or activated TGFβ type I receptor, was tested. As described previously, 
coexpression of activated type I receptors with the appropriate 
receptor-regulated Smad yields efficient phosphorylation of Smad protein. In lysates from 
cells expressing Smad2 alone or Smad2 with wild type receptors, efficient 
binding of Smad2 to GST-SBD was observed. In contrast, in the presence of 
activated TβRI, the interaction of Smad2 with GST-SBD was strongly reduced 
(Figure 11B). This reduction correlated with receptor-dependent 
phosphorylation, since the phosphorylation site mutant, Smad2(2SA), interacted 
efficiently with GST-SBD, even in the presence of activated TβRI (data not 
shown). These data strongly support a mechanism whereby SARA interacts with 
unphosphorylated Smad2 and receptor-dependent phosphorylation induces 
dissociation by altering the affinity of Smad2 for the SBD. 



Example 9 - The FYVE domain controls the subcellular localization of SARA 

The subcellular localization of a selection of the SARA mutants was 
analysed by immunofluorescence and confocal microscopy. Analysis of 
truncation mutants that removed the amino terminus upstream of the FYVE 
domain (Δ1-531) yielded wild type patterns of staining (Figure 12, compare 
panels i and ii). However, a further deletion (Δ1-664) that disrupted the FYVE 
domain but did not interfere with the Smad binding domain, abolished the wild 
type staining pattern (Figure 12, panel iii). Similar studies of the C-terminal 
domains showed that residues downstream of the FYVE domain (Δ665-1323) did 
not alter the localization of the mutant protein (Figure 12, panel iv), while 
truncations within the FYVE domain (Δ596-1323) led to diffuse localization 
throughout the cell (Figure 12, panel v). Of note, the Δ665-1323 mutant lacked 
the Smad binding domain, thereby indicating that interaction with Smad2 is not 
required for proper SARA localization. To confirm that FYVE domain function 
was required for localization of SARA, a mutant with a small internal deletion 
that removes the FYVE domain (Δ597-664) was tested. Consistent with the other 
mutants, localization of this protein was clearly disrupted (Figure 12, panel vi). 
Since none of these mutants interfered with Smad binding, the FYVE domain 
appears to be required to maintain the normal localization of SARA but is not 
involved in mediating interactions with Smads. 



Example 10 - SARA-mediated localization of Smad2 is necessary for TGFβ 
signaling 

The availability of mutants of hSARA1 that interact with Smad2 but fail to 
target to the appropriate subcellular sites allowed the question of whether 
hSARA1-mediated localization of Smad2 was important to TGFβ signaling to be 
addressed. Whether SARA(Δ1-594) and SARA(Δ1-664), which bind Smad but 
fail to distribute to the correct subcellular domains, would mislocalize Smad2 
was examined. Coexpression of either mutant with Smad2 showed that they 
were unable to recruit Smad2 to the normal SARA domains (Figure 13A, panels i 
and ii). As expected, SARA(Δ1-704), which lacks a Smad binding domain, was 
unable to control Smad2 localization (Figure 13A, panel iii). Whether these 
mutants could cause mislocalization of Smad2 was also examined. For this, 
cells were cotransfected with wild type hSARA1 and Smad2 either in the 
absence or presence of SARA(Δ1-594), SARA(Δ1-664) or SARA(Δ1-704). In 
control transfectants, performed in the absence of mutant hSARA1, hSARA1 and 
Smad2 were colocalized in punctate domains as described above (Figure 13B, 
panel i). However, in the presence of either SARA(Δ1-594) or SARA(Δ1-664), 
the localization of wild type hSARA1 was normal, but the distribution of Smad2 
was clearly disrupted and displayed a diffuse pattern (Figure 13B, panels ii and 
iii, respectively). Moreover, coexpression of SARA(Δ1-704), which does not 
bind Smad2, resulted in Smad2 distribution that was indistinguishable from that 
of the wild type pattern (Figure 13B, panel iv). Thus, SARA(Δ1-594) and 
SARA(Δ1-664) induce the mislocalization of Smad2. 

Since SARA(Δ1-664) mislocalizes Smads and interferes with receptor 
association, we investigated whether this mutant would disrupt TGFβ signaling. 
To test this, we transiently transfected the TGFβ-responsive reporter gene 
3TP-lux into Mv1Lu cells in the presence and absence of wild type or mutant 
versions of hSARA1. Expression of wild type hSARA1 had no effect on TGFβ 
signaling (Figure 14). In contrast, transfection of SARA(Δ1-664) significantly 
inhibited TGFβ-dependent signaling at the lowest concentration of DNA tested, 
while transfection of higher doses completely abolished responsiveness of the 
cells. We also tested SARA(Δ1-704) which lacks a functional Smad binding 
domain and does not alter Smad2 localization. Transfection of this mutant had 
no effect on TGFβ signaling (Figure 14). In addition to analysis of the 3TP 
promoter, we examined induction of the activin response element (ARE) from 
the Xenopus Mix.2 gene in HepG2 cells. 

This ARE is stimulated by either TGFβ or activin signaling, which induces 
assembly of a DNA binding complex that is composed of Smad2, Smad4 and a 
member of the FAST family of forkhead DNA binding proteins. Since HepG2 
cells do not possess endogenous FAST activity, wild type or mutants of hSARA1 
were cotransfected with FAST2 and the ARE-lux reporter plasmid as described 
previously (Labbe et al., 1998). Expression of either SARA(Δ1-594) or 
SARA(Δ1-664), which interfere with or delete the FYVE domain, respectively, resulted in 
a strong suppression of TGFβ-dependent induction of the ARE (Figure 15). 
However, none of the other mutants tested suppressed activation of this 
promoter. Since none of these latter mutants disturb the localization of 
hSARA1-Smad2 complexes, these data strongly suggest that recruitment of Smad2 to the 
receptor-containing subcellular domains is important for TGFβ signaling. 



Example 11 - Tissue distribution of hSARA expression 

The 3'UTR of hSARA1 and a Smad2 cDNA fragment were used to probe 
a human multiple tissue Northern blot (Clontech). The results are shown in 
Figure 16 - hSARA1: upper panel and Smad2: lower panel. hSARA1 and Smad2 
were ubiquitously expressed in the tissues examined; relatively low levels of 
hSARA1 were detected in liver. hSARA1 and Smad2 showed a similar 
expression pattern except in placenta, where proportionally more Smad2 
message was observed. A single transcript of 5.0 kb is seen, corresponding to 
the full length hSARA1 cDNA. 

SARA expression was examined in a variety of cell lines using RT-PCR 
analysis and the gene was found to be expressed in every cell line tested. These 
included HepG2 hepatoma cells, NBFL neuroblastoma cells, SW480 colorectal 
cancer cells, NIH 3T3 fibroblasts, P19 embryonic carcinoma cells, MC3T3 
calvarial cells and Mv1Lu lung epithelial cells (data not shown). hSARA1 
appears to be a ubiquitously expressed partner for Smad2 and Smad3. 

Example 12 - Interaction of endogenous hSARA1 and Smad2 in mammalian 
cells 

Lysates from HepG2 cells, either untreated or treated with 1 nM TGFβ, 
were immunoprecipitated with an affinity-purified, anti-hSARA1 rabbit 
polyclonal antibody and the immunoprecipitates were immunoblotted with a 
polyclonal, anti-Smad2 antibody (Macias-Silva et al., 1998). Controls were 
immunoprecipitated with pre-immune sera or N19 anti-Smad2/3 antibody. The 
results are shown in Figure 17. In immunoprecipitates prepared with 
preimmune antisera, no Smad2 was detectable. Anti-hSARA1 
immunoprecipitates clearly showed Smad2 co-precipitating with hSARA1. TGFβ 
treatment prior to lysis gave decreased association of Smad2 and SARA. 

These results demonstrate that SARA is a specific partner of receptor- 
regulated Smads in the TGFp/activin signaling pathway and further suggest that 
TGFp signaling induces dissociation of SARA/Srnad complexes. 
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The present invention is not limited to the features of the embodiments 
described herein, but includes all variations and modifications within the scope 
of the claims.

TABLE 1 - hSARA1 - Sequence ID NO:1

GCATACTGAATCAGCAGGACTGGCTGGTGGTGCAGCAGACATCATGAGTAAGCACCGA 
GAAGTCTGTTCCTTATCACGTGTGTAAGGGGAAAAAGGTTTAAACAAGTCTCTTAAGT 
GGTGTTTCCTCACCGATGGAGAATTACTTCCAAGCAGAAGCTTACAACCTGGGACAAG 
GTGTTAGATGAATTTGAACAAAACGAAGATGAAACAGTTTCTTCTACTTTATTGGATA 
CAAAGTGGAATAAGATTCTAGATCCCCCTTCTCACCGGCTGTCATTTAACCCTACTTT 
GGCCAGTGTGAATGAATCTGCAGTTTCTAATGAGTCACAACCACAACTGAAAGTCTTC 
TCCCTGGCTCATTCAGCTCCCCTGACCACAGAGGAAGAGGATCACTGTGCTAATGGAC 
AGGACTGTAATCTAAATCCAGAGATTGCCACAATGTGGATTGATGAAAATGCTGTTGC
AGAAGACCAGTTAATTAAGAGAAACTATAGTTGGGATGATCAATGCAGTGCTGTTGAA
GTGGGAGAGAAGAAATGTGGAAACCTGGCTTGTCTGCCAGATGAGAAGAATGTTCTTG 
TTGTAGCCGTCATGCATAACTGTGATAAAAGGACATTACAAAACGATTTACAGGATTG
TAATAATTATAATAGTCAATCCCTTATGGATGCTTTTAGCTGTTCACTGGATAATGAA
AACAGACAAACTGATCAATTTAGTTTTAGTATAAATGAGTCCACTGAAAAAGATATGA
ATTCAGAGAAACAAATGGATCCATTGAATAGACCGAAAACAGAGGGGAGATCTGTTAA 
CCATCTGTGTCCTACTTCATCTGATAGTCTAGCCAGTGTCTGTTCCCCTTCACAATTA 
AAGGATGACGGAAGTATAGGTAGAGACCCCTCCATGTCTGCGATTACAAGTTTAACGG 
TTGATTCAGTAATCTCATCCCAGGGAACAGATGGATGTCCTGCTGTTAAAAAGCAAGA
GAACTATATACCAGATGAGGACCTCACTGGCAAAATCAGCTCTCCTAGGACAGATCTA 
GGGAGTCCAAATTCCTTTTCCCACATGAGTGAGGGGATTTTGATGAAAAAAGAGCCAG
CAGAGGAGAGCACCACTGAAGAATCCCTCCGGTCTGGTTTACCTTTGCTTCTCAAACC 
AGACATGCCTAATGGGTCTGGAAGGAATAATGACTGTGAACGGTGTTCAGATTGCCTT 
GTGCCTAATGAAGTTAGGGCTGATGAAAATGAAGGTTATGAACATGAAGAAACTCTTG 
GCACTACAGAATTCCTTAATATGACAGAGCATTTCTCTGAATCTCAGGACATGACTAA 
TTGGAAGTTGACTAAACTAAATGAGATGAATGATAGCCAAGTAAACGAAGAAAAGGAA 
AAGTTTCTACAGATTAGTCAGCCTGAGGACACTAATGGTGATAGTGGAGGACAGTGTG 
TTGGATTGGCAGATGCAGGTCTAGATTTAAAAGGAACTTGCATTAGTGAAAGTGAAGA 
ATGTGATTTCTCCACTGTTATAGACACACCAGCAGCAAATTATCTATCTAATGGTTGT 
GATTCCTATGGAATGCAAGACCCAGGTGTTTCTTTTGTTCCAAAGACTTTACCCTCCA
AAGAAGATTCAGTAACAGAAGAAAAAGAAATAGAGGAAAGCAAGTCAGAATGCTACTC
AAATATTTATGAACAGAGAGGAAATGAGGCCACAGAAGGGAGTGGACTACTTTTAAAC 
AGCACTGGTGACCTAATGAAGAAAAATTATTTACATAATTTCTGTAGTCAAGTTCCAT
CAGTGCTTGGGCAATCTTCCCCCAAGGTAGTAGCAAGCCTGCCATCTATCAGTGTTCC 
TTTTGGTGGTGCAAGACCCAAGCAACCTTCTAATCTTAAACTTCAAATTCCAAAGCCA
TTATCAGACCATTTACAAAATGACTTTCCTGCAAACAGTGGAAATAATACTAAAAATA 
AAAATGATATTCTTGGGAAAGCAAAATTAGGGGAAAACTCAGCAACCAATGTATGCAG 
TCCATCTTTGGGAAACATCTCTAATGTCGATACAAATGGGGAACATTTAGAAAGTTAT 
GAGGCTGAGATCTCCACTAGACCATGCCTTGCATTAGCTCCAGATAGCCCAGATAATG
ATCTCAGAGCTGGTCAGTTTGGAATTTCTGCCAGAAAGCCATTCACCACGCTGGGTGA 
GGTGGCTCCAGTATGGGTACCGGATTCTCAGGCTCCAAATTGCATGAAATGTGAAGCC 
AGGTTTACATTCACCAAAAGGAGGCATCACTGCAGAGCATGTGGGAAGGTTTTCTGTG 
CTTCCTGCTGTAGCCTGAAATGTAAACTGTTATACATGGACAGAAAGGAAGCTAGAGT 
GTGTGTAATCTGCCATTCAGTGCTAATGAATGCTCAAGCCTGGGAGAACATGATGAGT 
GCCTCAAGCCAGAGCCCTAACCCTAACAATCCTGCTGAATACTGTTCTACTATCCCTC 
CCTTGCAGCAAGCTCAGGCCTCAGGAGCTCTGAGCTCTCCACCTCCCACTGTGATGGT 
ACCTGTGGGAGTTTTAAAGCACCCTGGAGCAGAAGTGGCTCAGCCCAGAGAGCAGAGG 
CGAGTTTGGTTTGCTGATGGGATCTTGCCCAATGGAGAAGTTGCTGATGCAGCCAAAT 
TAACAATGAATGGAACTTCCTCTGCAGGAACCCTGGCTGTGTCACACGACCCAGTCAA 
GCCAGTAACTACCAGTCCTCTACCAGCAGAGACGGATATTTGTCTATTCTCTGGGAGT 
ATAACTCAGGTTGGAAGTCCTGTTGGAAGTGCAATGAATCTTATTCCTGAAGATGGCC 
TTCCTCCCATTCTCATCTCCACTGGTGTAAAAGGAGACTATGCTGTGGAAGAGAAACC 
ATCACAGATTTCAGTAATGCAGCAGTTGGAGGATGGTGGCCCTGACCCACTTGTATTT 
GTTTTAAATGCAAATTTGTTGTCAATGGTTAAAATTGTAAATTATGTGAACAGGAAGT 
GCTGGTGTTTCACAACCAAGGGAATGCATGCAGTGGGTCAGTCTGAGATAGTCATTCT 
TCTACAGTGTTTACCGGATGAAAAGTGTTTGCCAAAGGATATCTTTAATCACTTTGTG
CAGCTTTATCGGGATGCTCTGGCAGGGAATGTGGTGAGCAACTTGGGACATTCCTTCT
TCAGTCAAAGTTTCCTTGGCAGTAAAGAACATGGTGGATTCTTATATGTGACATCTAC

TABLE 1 - hSARA1 - Continued

CTACCAGTCACTGCAAGACCTAGTACTCCCAACCCCACCTTACTTGTTTGGGATTCTT
ATCCAGAAATGGGAAACTCCTTGGGCTAAAGTATTTCCTATCCGTCTGATGTTGAGAC 
TTGGAGCTGAATATCGACTTTATCCATGCCCACTATTCAGTGTCAGATTTCGGAAGCC 
ATTGTTTGGAGAGACGGGGCATACCATCATGAATCTTCTTGCAGACTTCAGAAATTAC 
CAGTATACCTTGCCAGTAGTTCAAGGTTTGGTGGTTGATATGGAAGTTCGGAAAACTA 
GCATCAAAATTCCCAGCAACAGATACAATGAGATGATGAAAGCCATGAACAAGTCCAA
TGAGCATGTCCTGGCAGGAGGTGCCTGCTTCAATGAAAAGGCAGACTCTCATCTTGTG 
TGTGTACAGAATGATGATGGAAACTATCAGACCCAGGCTATCAGTATTCACAATCAGC 
CCAGAAAAGTGACTGGTGCCAGTTTCTTTGTGTTCAGTGGCGCTCTGAAATCCTCTTC 
TGGATACCTTGCCAAGTCCAGTATTGTGGAAGATGGTGTTATGGTCCAGATTACTGCA 
GAGAACATGGATTCCTTGAGGCAGGCACTGCGAGAGATGAAGGACTTCACCATCACCT 
GTGGGAAGGCGGACGCGGAGGAACCCCAGGAGCACATCCACATCCAGTGGGTGGATGA 
TGACAAGAACGTTAGCAAGGGTGTCGTAAGTCCTATAGATGGGAAGTCCATGGAGACT
ATAACAAATGTGAAGATATTCCATGGATCAGAATATAAAGCAAATGGAAAAGTAATCA
GATGGACAGAGGTGTTTTTCCTAGAAAACGATGACCAGCACAATTGCCTCAGTGATCC 
TGCAGATCACAGTAGATTGACTGAGCATGTTGCCAAAGCTTTTTGCCTTGCTCTCTGT 
CCTCACCTGAAACTTCTGAAGGAAGATGGAATGACCAAACTGGGACTACGTGTGACAC 
TTGACTCAGATCAGGTTGGCTATCAAGCAGGGAGCAATGGCCAGCCCCTTCCCTCGCA
GTACATGAATGATCTGGATAGCGCCTTGGTGCCGGTGATCCATGGAGGGGCCTGCCAG 
CTTAGTGAGGGCCCCGTTGTCATGGAACTCATCTTTTATATTCTGGAAAACATCGTAT
AAACAGAGAAGACTTCATTTTTTTCTGTTCAGACTTGTTGCAACAGCAGTCATACCCA 
AATCATTTGCACTTTAAAACTGGAAGATTAAGCTTTTGTTAACACTATTAATGGGGTG 
GGGAATAGGGTGGGAGTGGGGGTTTGGGAGACGGGTGGGAAAGGGTGGTTGGGGGGAC 
CGATGTTCCATAATTCTAAGTCTTCTATGCATTGTCCACCAAGAAGATCTGGGCAGCT 
TCTGTTCCTGCACAACAGTTATGCTATCCTTGCAGCTAATCCCCTTCTGTTACTGTTT 
AGACAAGAATTCCGCTCCTCTCTCAAGATTTACTTATGGTCATGTGCTCAGAAATGCT 
CAAATGGGTACAACCATCACCAAGGGTGGGATGGGAGGGCAGAGGGGAAATAAAATAT 
AAAGCATCAAAAAAAAAAAAAAAAA

TABLE 2 - hSARA1 - Sequence ID NO:2

MWIDENAVAEDQLIKRNYSWDDQCSAVE^ 
TLQNDLQDOINYNSQSLMDAFSCSI^^ 

PKTEGRSVNHLCPTSSDSIASVCSPSQLIODDGSIGRDPSMSAITSLTVDSVISSOGTD 
GCPAVKKQENYIPDEDLTGKISSPRTDLGSPNSFSHMSEGILMKKEPAEESTTEESLR
SGLPLLLKPDMPNGSGRNNDCERCSDCLVPNEVRADENEGYEHEETLGTTEFLNM 
FSESQDMTNWKLTKLNEMNDSQVNEEKEKFLQISQPEDTO 

GTCISESEECDFSTVIDTPAANYLSNGCDSYGMQDPGVSFVPKTLPSKEDSVTEEKEI 
EESKSECYSNIYEQRGNEATEGSGLLLNSTGDLMKKNYLHNFCSQVPSVLGQSSPKW 
ASLPSISVPFGGARPKQPSNLKLQIPKPLSDHLQNDFPANSGNNTKNKNDILGKAKLG 
ENSATNVCSPSLGNISNVDTNGEHLESYEAEISTRPCLALAPDSPDNDLRAGQFGISA
RKPFTTLGEVAPVWVPDSQAPNCMKCEARFTFTKRRHHCRACGKVFCASCCSLKCKLL
YMDRKEARVCVICHSVLMNAQAWENMMSASSQSPNPNNPAEYCSTIPPLQQAQASGAL
SSPPPTVMVPVGVLKHPGAEVAQPREQRRVWFADGILPNGEVADAAKLTMNGTSSAGT
LAVSHDPVKPVTTSPLPAETDICLFSGSITQVGSPVGSAMNLIPEDGLPPILISTGVK 
GDYAVEEKPSQISVMQQLEDGGPDPLVFVLNA^ 

VGQSEIVILLQCLPDEKCLPKDIFNHFVQLYRDALAGNWSNLGHSFFSQSFLGSKEH
GGFLYVTSTYQSLQDLVLPTPPYLFGILIQKWETPWAKVFPIRLMLRLGAEYRLYPCP 
LFSVRFRKPLFGETGHTIM^LADFRNYQYTLPWQGL^^ 

MMKAMNKSNEHVLAGGACFNEKADSHLVCVQNDDGNYQTQAISIHNQPRKVTGASFFV
FSGALKSSSGYLAKSSIVEDGVMVQITAENMDSLRQALREMKDFTITCGKADAEEPQE 
HIHIQWVDDDKNVSKGWSPIDGKSMETITIA^IFHGSEYKANGKVIRWTEVFFLEND 
DQHNCLSDPADHSRLTEHVAKAFCLAJLCTQLKLLKGDGMTKLGLRVTLDSDQVGYQAG 
SNGQHLPSQYMNDFDSDLVKMIHGGACQLSEGPWMELIFYILENIV

TABLE 3 - human SARA2 - Sequence ID NO:3

ACTCCCGGCCGGGGTAGCTCTTCACTCCTCAGCGCGACGTCGTGTCGAGTTCCCAAAA 

AGCTCCGCAGGGGCTGTAGGGAGGTGATCTCATCCATTAACAGCTGTGTGTTGCCAGT 

TCCCAAATCTTTATCTATCTCAGACTTCTCTCCTGCATTCCAGATTCTTATATTCAGC 

TGCCTTTTGGATATCTCTCCCAGGATGTTCTCAAGGCATACAAGAATTAAATTCTGAA 

TAAGTCTGCAGGTAGGATGGACAGTTATTTTAAAGCAGCTGTCAGTGACTTGGACAAA 

CTCCTTGATGATTTTGAACAGAACCCAGATGAACAAGATTATCTCGCAGATGTACAAA 

ATGCATATGATTCTAACCACTGCTCAGTTTCTTCAGAGTTGGCTTCCTCACAGCGAAC 

TTCATTGCTCCCAAAAGACCAAGAGTGCGTTAATAGTTGTGCCTCATCAGAAACAAGC 

TATGGAACAAATGAGAGTTCCCTGAATGAAAAAACACTCAAGGGACTTACTTCTATAC 

AAAATGAAAAAAATGTAACAGGACTTGATCTTCTTTCTTCTGTGGATGGTGGTACTTC 

AGATGAAATCCAGCCGTTATATATGGGACGATGTAGTAAACCTATCTGTGATCTGATA 

AGTGACATGGGTAACTTAGTTCATGCAACCAATAGTGAAGAAGATATTAAAAAATTAT 

TGCCAGATGATTTTAAGTCTAATGCAGATTCCTTGATTGGATTGGATTTATCTTCAGT 

GTCAGATACTCCCTGTGTTTCTTCAACAGACCATGATAGTGATACTGTCAGAGAACAA 

CAGAATGATATCAGTTCTGAATTACAAAATAGAGAAATCGGAGGAATCAAAGAATTGG 

GTATAAAAGTAGATACAACACTTTCAGATTCCTATAATTACAGTGGAACAGAAAATTT

AAAAGATAAAAAGATCTTTAATCAGTTAGAATCAATTGTTGATTTTAACATGTCATCT 

GCTTTGACTCGACAAAGTTCCAAAATGTTTCATGCCAAAGACAAGCTACAACACAAGA

GCCAGCCATGTGGATTACTAAAAGATGTTGGCTTAGTAAAAGAGGAAGTAGATGTGGC

AGTCATAACTGCCGCAGAATGTTTAAAAGAAGAGGGCAAGACAAGTGCTTTGACCTGC 

AGCCTTCCGAAAAATGAAGATTTATGCTTAAATGATTCAAATTCAAGAGATGAAAATT

TCAAATTACCTGACTTTTCCTTTCAGGAAGATAAGACTGTTATAAAACAATCTGCACA 

AGAAGACTCAAAAAGTTTAGACCTTAAGGATAATGATGTAATCCAAGATTCCTCTTCA

GCTTTACATGTTTCCAGTAAAGATGTGCCGTCCTCATTGTCCTGTCTTCCTGCGTCTG 

GGTCTATGTGTGGATCATTAATTGAAAGTAAAGCACGGGGTGATTTTTTACCTCAGCA 

TGAACATAAAGATAATATACAAGATGCAGTGACTATACATGAAGAAATACAGAACAGT 

GTTGTTCTAGGTGGGGAACCATTCAAAGAGAATGATCTTTTGAAACAGGAAAAATGTA 

AAAGCATACTCCTTCAGTCATTAATTGAAGGGATGGAAGACAGAAAGATAGATCCTGA

CCAGACAGTAATCAGAGCTGAGTCTTTGGATGGTGGTGACACCAGTTCTACAGTTGTA 

GAATCTCAAGAGGGGCTTTCTGGCACTCATGTCCCAGAGTCTTCTGATTGTTGTGAAG 

GTTTTATTAATACTTTTTCAAGCAATGATATGGATGGGCAAGACTTAGATTACTTTAA 

TATTGATGAAGGCGCAAAAAGTGGCCCACTAATTAGTGATGCTGAACTTGATGCCTTT

CTGACAGAACAGTATCTTCAGACCACTAACATAAAGTCTTTTGAAGAAAATGTAAATG 

ACTCTAAATCGCAAATGAATCAGATAGATATGAAAGGCTTAGATGATGGAAACATCAA 

TAATATATATTTCAATGCAGAAGCAGGAGCTATTGGGGAAAGTCATGGTATTAATATA

ATTTGTGAAACAGTTGATAAACAAAATACAATAGAAAATGGCCTTTCTTTAGGAGAAA 

AAAGCACTATTCCAGTTCAACAAGGGTTACCTACCAGTAAGTCTGAGATTACAAATCA 

ATTATCAGTCTCTGATATTAACAGTCAATCTGTTGGAGGGGCCAGACCTAAGCAATTG 

TTTAGCCTTCCATCAAGAACAAGGAGTTCAAAGGACCTGAATAAGCCAGATGTTCCAG

ATACAATAGAAAGTGAACCCAGCACAGCAGATACCGTTGTTCCAATCACTTGTGCTAT

AGATTCTACAGCTGATCCACAGGTTAGCTTCAACTCTAATTACATTGATATAGAAAGT

AATTCTGAAGGTGGATCTAGTTTCGTAACTGCAAATGAAGATTCTGTACCTGAAAACA 

CTTGCAAAGAAGGCTTGGTTTTGGGCCAGAAACAGCCTACTTGGGTTCCTGATTCAGA 

AGCTCCAAACTGTATGAACTGCCAAGTCAAATTTACTTTTACCAAACGGCGACACCAT 

TGCCGAGCATGTGGGAAAGTATTTTGTGGTGTCTGTTGTAATAGGAAGTGTAAACTGC 

AATATCTAGAAAAGGAAGCAAGAGTATGTGTAGTCTGCTATGAAACTATTAGTAAAGC 

TCAGGCATTTGAAAGGATGATGAGTCCAACTGGTTCTAATCTTAAGTCTAATCATTCT 

GATGAATGTACTACTGTCCAGCCTCCTCAGGAGAACCAAACATCCAGTATACCTTCAC 

CAGCAACTTTGCCAGTCTCAGCACTTAAACAACCAGGTGTTGAAGGACTATGTTCCAA 

AGAACAGAAGAGAGTATGGTTTGCAGATGGTATATTGCCCAATGGTGAAGTTGCAGAT 

ACAACAAAATTATCATCTGGAAGTAAAAGATGTTCTGAAGACTTTAGTCCTCTCTCAC 

CTGATGTGCCTATGACAGTAAACACAGTGGATCATTCCCATTCTACTACAGTGGAAAA

GCCAAACAATGAGACAGGAGATATTACAAGAAATGAGATAATTCAGAGTCCTATTTCT

CAGGTTCCATCAGTGGAAAAATTGTCTATGAACACAGGAAATGAGGGGTTACCTACTT 

CTGGTTCATTTACACTAGATGATGATGTTTTTGCAGAAACTGAAGAACCATCTAGTCC

TABLE 3 - human SARA2 - Continued

TACTGGTGTCTTAGTTAACAGCAATTTACCTATTGCTAGTATTTCAGATTATAGGTTA

CTGTGTGATATTAACAAGTATGTCTGCAATAAGATTAGTCTTCTACCTAATGATGAGG

ACAGTTTGCCCCCACTTCTGGTTGCATCTGGAGAAAAGGGATCAGTGCCTGTAGTAGA 

AGAACATCCATCTCATGAGCAGATCATTTTGCTTCTTGAAGGTGAAGGCTTTCATCCT

GTTACATTTGTCCTAAATGCTAATCTACTCGTGAATGTCAAATTCATATTTTATTCCT

CAGACAAATATTGGTACTTTTCAACCAATGGATTGCATGGCTTGGGACAGGCAGAAAT 

TATTATTCTATTGTTATGTTTGCCAAATGAAGATACTATTCCTAAGGACATCTTCAGA 

CTATTTATCACCATATATAAGGATGCTCTAAAAGGAAAATACATAGAAAACTTGGACA 

ATATTACCTTTACTGAGAGTTTTCTCAGTAGCAAGGATCACGGAGGATTCCTGTTTAT 

TACACCTACTTTTCAGAAACTTGATGATCTCTCATTACCAAGTAATCCTTTTCTTTGT 

GGAATTCTTATCCAGAAGCTTGAGATTCCCTGGGCAAAGGTTTTTCCTATGCGTTTAA 

TGTTGAGATTGGGTGCAGAATATAAAGCATATCCTGCTCCTCTAACAAGCATCAGAGG 

CCGAAAACCTCTTTTTGGAGAAATAGGACACACTATTATGAACTTACTTGTTGACCTT

CGAAATTACCAGTATACCTTGCATAATATAGATCAACTGTTGATTCATATGGAAATGG 

GAAAAAGCTGCATAAAAATACCACGGAAAAAGTACAGTGATGTAATGAAAGTACTAAA

TTCTTCCAATGAGCATGTCATTAGCATTGGAGCAAGTTTCAGTACAGAAGCAGATTCT

CATCTAGTCTGTATACAGAATGATGGAATTTATGAAACACAGGCCAACAGTGCCACTG 

GCCATCCTAGAAAAGTGACAGGTGCAAGTTTTGTGGTATTCAATGGAGCTCTAAAAAC

ATCTTCAGGATTTCTTGCTAAGTCCAGCATAGTTGAAGATGGCTTAATGGTACAAATA 

ACTCCAGAGACCATGAATGGCTTGCGGCTAGCTTTACGAGAACAGAAAGACTTTAAAA 

TTACATGTGGGAAAGTTGATGCAGTAGACCTGAGAGAATACGTGGATATCTGCTGGGT 

AGATGCTGAAGAAAAAGGAAACAAAGGAGTTATCAGTTCAGTGGATGGAATATCATTA 

CAAGGATTTCCAAGTGAAAAAATAAAACTGGAAGCAGATTTTGAAACCGATGAGAAGA 

TTGTAAAATGTACCGAGGTGTTCTACTTTCTAAAGGACCAGGATTTATCTATTTTATC 

AACTTCTTATCAGTTTGCAAAAGAAATAGCCATGGCTTGTAGTGCTGCGCTGTGCCCT 

CACCTGAAAACTCTAAAAAGTAATGGGATGAATAAAATTGGACTCAGAGTTTCCATTG

ACACTGATATGGTTGAATTTCAGGCAGGATCTGAAGGCCAACTTCTGCCTCAGCATTA 

TCTAAATGATCTTGATAGTGCTCTGATACCTGTGATCCATGGTGGGACCTCCAACTCT 

AGTTTACCATTAGAAATAGAATTAGTGTTTTTCATTATAGAACATCTTTTTTAGTGAA 

AGAATGTGCCATATTACATATTGCAACCTAATTTGTTAAAACTAACTCCAGCACTAAA

GCTGAAATGCCACAAACACTAAAAGTATAAATATGTCTGATTTTTGAAACACATAAGC

TTTGCTCTTTAGGCAGGAATGATCTTTTCAAATCATTAGCACAATATTTAAATATCTA 

AAAATTTAAGAGATCCATACTTTCTGTAGCTTTACAATTAATTTAAGTACTAAAAAGA 

CAAGGATTTCTTTTAAGAAATTTATAGCATTTACTGTGTTATTTAAATGCTAAGCCAA

AGTATCTGCACTTAGGTATACCTCTTTATGCCAATAATGATTTTAATGAAGGCTCTTT 

TCAGATGTAACCTTATGAAGGAAATATCTGCTTTGTGTATATGCCAGTTAGAATACTG 

GTTTCTAAAGTCTGTCAAATTGTATTTCAGTGGCACAAAAACCAGTTTTGAGGTCTTA 

GACTTATAATTCTTTGAATAAAACTGATAACTTATTTGTATAATTGGAGTGGAGACCT 

ACCTCCATAATTAGATAAACTCTTTTTGGATTATAATCAGAATTTTGCCTTTTTTCTT 

CTCAAATTATTACATATGTATGTATTATATATCCACATATATAGTTTTCCCTGATTAA 

ATGGATATTAAAATAATTGCGGGTGCTTCAGGACTTTTTGCTTCTATATTTAAGTATA 

TTGTTTTTATAGCAAGAACATATTCTGAATGTTTTATAAATCTTTAATAATTTATATG 

TAGGTAATATTTTTGTATCACAATGCATTATTTTTTTTCCTCCTTTCCTTCCAAACTA 

TACCACTGTATTTACCACTTCTAAGAGTGACTGACGACGGGCCAGATGACCCTTGAAG 

TAGTCATTATGTAGCAATAAATGAAGCCTGAAACAGGTTTTTTTACTTCCACTTTAAT 

CCTTAGAAATTTCTTGGCAACTTCGCATATTTTCATTGACACTGGTGTATAAGTATAA 

ATTTAAATGAACTAATTACTTTTGCATATTTTAAATTCTTTATATGGTAGTTATTTTT

TATAACAGGATATTAACATAAGTTAAATCCTATGTATTTGAAATTGTTACAGAGCTTT

CCTCTTTACTTCAAACAGCAAAAAAGTGGGGGGCATATTGTAGTCCTGTCATTTAAGT 

TATGTAAAAAATTTAATCATTATTTTGATGCTTTAAACATTCTCATGTGTAATATATG 

TTTTTGTATCAAAAACACTCATATATTTCAAGAAAAAGAAATTATGTTAAATAGCCCT

GTTTTAAGAAAAATATTTATGAAGCATCTCAACTTGAAGATCAAGTCAAAGTTATAAC 

TCAGGATCTGAGGTCTCAAGCTAGGAGAGACTGAGAATTTTAATCAGTTTGGGCATAT 

AGTTTGGACTGAATCACATCTGTAGTACTTAGCCAAAGACAATTTGGAGGAGAATATC 

AGCCTTCTGGAAGTAGCTACTTCCTGAACAATGTAAAGTGTCGCAGATATTCAATAAA

TABLE 3 - human SARA2 - Continued

ATGGCAACCTGTTATAATTTGTGAAATTTATTGAAATGGTGTAAGATGAAAACAATTG 
CATATCAAACCCAATTTATGTTTTCTAAATATAGTGTATGTATTCTGCCATGTAAGTA 
ATTGAACAGTCTTAAAATAACCAAATGGTAGAGGGCTGTTCCATGATGGGACAGCTTT 
GGATTTGTTTTCATAAAATCTCTACATTCAATAAAAATTGGAATTATGTGCCTGAAGT 
TTGGAGGCACATTTTGAAGT

TABLE 4 - human SARA2 - Sequence ID NO:4

MDSYFKAAVSDLDKLLDDFEQNPDEQDYLQDVQNAYDSNHCSVSSELASSQRTSLLPK
DQECVNSCASSETSYGTNESSLNEKTLKGLTSIQNEKNVTGLDLLSSVDGGTSDEIOP 
LYMGRCSKPICDLISDMGNLVHATNSEEDIKICLLPDDFKSNADSLIGLDLSSVSDTPC 
VSSTDHDSDTVREQQNDTSSELQNREIGGIKELGIKVDTTLSDSYNYSGTENLKDKKI 
FNQLESIVDFNMSSALTRQSSKMFHAKDKLQHKSQPCGLLKDVGLVKEEVDVAVITAA 
ECLKEEGKTSALTCSLPKNEDLCLNDSNSRDENFKLPDFSFQEDKTVIKQSAQEDSKS 
LDLKDNDVIQDSSSALHVSSKDVPSSLSCLPASGSMCGSLIESKARGDFLPQHEKKDN 
IQDAVTIHEEIQNSWLGGEPFKENDLLKQEKCKSILLQSLIEGMEDRKIDPDQTVIR 
AESLDGGDTSSTVVESQEGLSGTHVPESSDCCEGFINTFSSNDMDGQDLDYFNIDEGA 
KSGPLISDAELDAFLTEQYLQTTNIKSFEENVNDSKSQMNQIDMKGLDDGNI1TOIYFN 
AEAGAIGESHGINIICETVDKQNTIENGLSLGEKSTIPVQQGLPTSKSEITNQLSVSD 
INSQSVGGARPKQLFSLPSRTRSSKDLNKPDVPDTIESEPSTADTWPITCAIDSTAD 
PQVSFNSNYIDIESNSEGGSSFVTANEDSVPENTCKEGLVLGQKQPTWVPDSEAPNCM 
NCQVKFTFTKRRHHCRACGKVFCGVCCNRKCKLQYLEKEARVCVVCYETISKAQAFER
MMSPTGSNLKSNHSDECTTVQPPQENQTSSIPSPATLPVSALKQPGVEGLCSKEQKRV
WFADGILPNGEVADTTKLSSGSKRCSEDFSPLSPDVPMTVNTVDHSHSTTVEKPNNET
GDITRNEIIQSPISQVPSVEKLSMNTGNEGLPTSGSFTLDDDVFAETEEPSSPTGVLV
NSNLPIASISDYRLLCDINKYVCNKISLLPNDEDSLPPLLVASGEKGSVPVVEEHPSH
EQIILLLEGEGFHPVTFVLNANLLVNVKFIFYSSDKYWYFSTNGLHGLGQAEIIILLL
CLPNEDTIPKDIFRLFITIYKDALKGKYIENLDNITFTESFLSSKDHGGFLFITPTFQ 
KLDDLSLPSNPFLCGILIQKLEIPWAKVFPMRLMLRLGAEYKAYPAPLTSIRGRKPLF 
GEIGHTIMNLLVDLRNYQYTLHNIDQLLIHMEMGKSCIKIPRKKYSDVMKVLNSSNEH
VISIGASFSTEADSHLVCIQNDGIYETQANSATGHPRKVTGASFWFNGALKTSSGFL 
AKSSIVEDGLMVQITPETMNGLRLALREQKDFKITCGKVDAVDLREYVDICWVDAEEK
GNKGVISSVDGISLQGFPSEKIKLEADFETDEKIVKCTEVFYFLKDQDLSILSTSYQF
AKEIAMACSAALCPHLKTLKSNGMNKIGLRVSIDTDMVEFQAGSEGQLLPQHYLNELD
SALIPVIHGGTSNSSLPLEIELVFFIIEHLF

TABLE 5 - XSARA1 - Sequence ID NO:5

CTGTAAGTTTGACTATGTAGGAAAGCATTTCTGTTATCTATGAAGTATGTTTTAGAGT
CAGACCAATAACTAAACGGTTTTCTTTTTTTTGTTTATTTCCCCTCAGATGAGACTGT
CTCTCCAAAGCTATTAGATGCTAAGTGGAATCAAATCTTAGAACCGCATTCACATAAA
GTCGCTGATAACTCCGCCCTTGACAATGTCTGTAAATCAATCATTGCTATTGAAGCTC
ATCTCAAAGTCAGGTCACCCGGCTTGTCAGCCCTTGTGAGGTCCACATATGTGAATGG
AGAAGTAGGTATTGTGGCACCTGAAATGCCCAAAATGGTGATAGGAGACACCATTATG
GCAGAGGATTCACTTTTTAACAACACTGGTCCCTCTGAAATTGTATGCAACCCATCTA
CTGTGGAGAGTCAAAGTTTACAAGCTTTAGATGATCAATCAGTGAATATTCACAATGA
AAAAAGTGTTCTGCTCGCTGATGGCTTTTCACCATGCAGTAGCCCCAAAAGTATTATA
AACTTTGACTGCTTGACCATGGATAACGAAATGCCTTTGCACAGTCAAATGAGTGTTG
ATGACAATGACAAAGAAACTGTAACAATTTCAGTCCTTCCAACAATCATACAGGATAC
TAGTAACGTAAGCACAGACCCAGCTATCAATAAACCTGGCACTAAAGAACCCCATAGA
GCATTAAAGGAAACCACATCAGTTATTCTGCCTGAAATAAAGCCTTACTCCACATGTG
CTGCCCTTTCGTTTGAAAATAACAATAAGGTTCCCAGTTATCAATTAAATAATACAGA
TCTACTCAGCGTTTCACCAGTGGTTGAAGCATGTAGTGAGCAGCAGCAAAAACATACA
TCTTCCTTGCATGAAGAAAAACTTTTTGAAGGTGTTTCTGCAACGGAGTCCTTTGCAG
CCACTGCTGCGGAAACTGTACTGGATAATGAGGCTCTCCGTAGTGCTGAATTCTTTGA
CATTGTTGTAAAGAACTTTTCTGACTCTTGTGTGATTAATGGCGACTTGACTAAAAGT
TGTGGCCTCTCTCAAGAAAGCAATGAAAAGTTTTGTGCAAGTAAAGAGTTTGAAGGAG
GGGTAGATGCTAATGTCTTGTTGGAAAATGCATGTGTAGCTTATAAAGAAGCAATAGA
TTTGCCTGAAGAAAATGGAACTAATGCACCAATGTCTCTGTACAATGGGTGTGATTCC
TATGGAATGAAAAACCCAGCCGTAGCTCAAAACCCAAAGAATTTACCTTCAAAAGAAG
ATTCTGTGACAGAAGAAAAAGAAATTGAAGAAAGCAAGTCAGAATACTATACTGGTGT
TTATGAACAACAAAGAGAAGATGATGTTACAGAGAGAGGTGGACTTCTGTTAAATGCT
AAGGCTGACCAAATGAAGAACAATTTGCATAGTCTTTGTAATCAGGTTCCATCCATGC
ATGGGCAAACATCACCAAAAAAGGGCAAGATTGTGCAATCTCTCAGTGTTCCATACGG
TGGAGCACGCACTAAGCAGCCAACTCATCTCAAACTCCATATTCCAAAGCCATTGTCT
GAAATGTTGCAGAGCGATCTCATTCCTCCAAATGCTGGCTGCAGCTCTAAATACAAAA
ATGACATGTTAAACAAATCAAATCAGGGGGATAACCTGATTTCAGAATCACTGCGTGA
GGATTCTGCAGTGCGCAGCCCTGTTACTGATGCTAATGGTGATTTCCCTGGAGAATAC
AGGGGACCTGGCAGCTTGTGCCTTGCAGTGTCTCCAGACAGCCCAGACAACGATCTGC
TTGCCGGGCAGTTTGGGGTACCCATCTCTAAGCCATTTACTACTCTAGGGGAAGTGGC
TCCAGTCTGGGTGCCAGATTCCCAAGCACCAAACTGCATGAAGTGCGAGGCCAGATTT
ACATTTACCAAAAGGAGGCATCACTGCCGAGCTTGTGGAAAGGTGTTCTGTGCTGCTT
GTTGCAGTCTAAAATGCAAACTACAGTACATGGATAAAAAGGAGGCTCGTGTGTGTGT
TATTTGTCATTCTGTGCTTATGAATGCTGAAGCATGGGAGAACATGTTAAGTGCATCG
GTCCAAAGCCCAAATCCAAATAATCCTGCTGAATACTGCTCAACTATCCCTCCGATGC
AGCAGGCACAAGCTTCAGGAGCACTGAGTTCCCCACCTCCCACTGTCATGGTGCCAGT
GGGTGTGTTAAAACATCCAGGAACTGAAGGGTCACAGTCAAAGGAACAGCGCCGTGTT
TGGTTTGCTGATGGAATATTACCCAACGGAGAGACTGCTGACTCAGATAATGCAAACG
TAACTACAGTGGCTGGGACACTTACTGTGTCACATACCAACAATTCCACATCTTGAGA
GTCTGAGAACACCTCTGGATTCTGTGGAAGTATAACTCAGGTTGGCAGTGCAATGAAC
CTTATTCCAGAAGATGGGCTTCCTCCTATACTAATCTCTACTGGAGTAAAAGGAGATT
ACGCAGTTGAGGAACGCCCTTCCCAGATGTCTGTGATGCAGCAACTAGAGGAAGGAGG
ACCAGATCCTTTGGTTTTTGTTCTAAATGCAAATCTTTTGGCCATGGTTAAGATCGTG
AACTATGTTAACAGGAAATGCTGGTGCTTTACTACAAAGGGAATGCATGCAGTGGGCC
AGGCTGAGATCGTAATCCTGTTGCAGTGCCTGCCTGATGAGAAGTGCCTGCCGAGGGA
CCTGTTTAGCCATTTTGTTGAGCTGTATCAGGAGGCAATTGCAGGTAATGTAGTGGGG
AACCTGGGGCATTCCTTCCTCAGCCAGAGTTTCCTGGGTAGTAAGGATCATGGTGGAT
TTCTTTATGTTGCACCAACCTACCAGTCCCTCCAGGACCTGGTTCTTCCTGCAGAGCC
GTACTTGTTTGGAATCCTTATTCAAAAGTGGGAGACTCCATGGGCCAAAGTGTTCCCC
ATTCGGCTTATGCTGCGTTTAGGTGCAGAATACAGATTGTACCCATGTCCACTCTTCA
GTGTTCGATACAGAAAACCTCTGTTTGGGGAAACCGGACACACCATCATTAATGTTCT
ATGGAAGTCAGAAAAACTAGCATTAAAATCCCCAGCAANAGATACAATGAGATGATGA
AAGCAATGAACAAATCCAATGAGCATGTGTTGGCCATAGGNNCATGCTTCAACCAGAT

TABLE 5 - XSARA1 - Continued

GGCAGACTCTCACCTTGTGTGTGTGCAAAACGATGATGGCAATTACCAGACCCAGGCA 
ATTAGTATCCACAAACAACCACGTAAAGTGACCGGGGCCAGCTTCTTTGTCTTCAGTG 
GTGCACTAAAGTCTTCTTCCGGATACCTGGCCAAATCCAGCATAGTAGAAGATGGGGT 
AATGGTTCAGATCACCGCAGAGAGCATGGATGCCCTCAGACAGTCCCTTCGGGAGATG
AAGGATTTCACCATTACATGTGGAAAAGCTGATGCAGAGGAGTCACAGGAACATGTCT 
ATGTCCAGTGGGTGGAGGATGACAAGAACTTTAACAAAGGAGTTTTTAGTCCAATCGA 
TGGCAAATCAATGGAGTCTGTGACCAGCGTCAAGATTTTTCATGGCTCAGAATACAAA 
GCTAGTGGAAAAATAATTCGCTGGATAGAGGTCTTCTTTCTGGACAATGAGGAGCAAC 
AGAGTGGCCTGAGTGACCCTGCTGATCACAGCCGACTCACTGAAAATGTGGCCAAAGC
ATTCTGTTTAGCGCTTTGCCCACACCTCAAGCTACTGAAGGAAGATGGAATGACCAGG 
TTAGGTCTGCGGGTGTCACTGGACTCAGACCAGGTTGGATACCAAGCTGGGAGCAATG
GGCAACTCCTGCCTGCCCGATACACCAATGATTTGGATGGTGCTTTGGTACCAGTGAT 
ACACGGGGGCACATGCCAGTTAAGTGAAGGGCCTGTCAGTATGGAGCTGATATTTTAT 
ATCCTTGAGAACATCTCCTAGGAAAGACACATGTGTCTCCTCACAAACTGCCATCGCC 
CAAACCATTTGCACTTTAACCGCAAAAGATTCATTTTTCTTTTCTTTTGCTAACACTA 
GTATTAGGTCAGGGTGCGAGAGGCAGACACCTGAACTCTTAAACCTTCTATGCATTTT 
CACAGTAAGGATCAAGCTGCAGCTGGGAATTTCCTGTTACTAATCCAATGTGGGACGT 
TAGAAGTGATCGGTGGCACTGACTATCTAGCTGTTCAACCTTCTCTGGCTCCTCTAAG 
GACTCTAGTGCCAGGGGGTGAGACATTCAAGTTTAAAACGAAAACTCTAAATACAATC 
AGGAATCTCACTCTGACCTCATTTAAATCATCACTGCGACTTTTTTTCCTGCTCGCAT 
TCTTTATTTTGCATCTTACTCAAGTTTACATTGTCAAGACCAGCCTAAGCCTTCAGTC 
CTTTCTCAATTAAACTACTCGTGCATGGCAAGGAGACTTTCGTTGCACAGCCTGAAAT 
ATACCAATCACTTCCCAAACCACAAGCATGAATCCAACGTTTTCCTGACTGGTTGGCT 
CTGCTGTGAAAGGGACAGCAATATTATTTTTCTACAGTTGACAAAACTTTTGTCTATG 
TCTGTGTCTCTCATGGGGGATTTGTTGCCTGATGGGCAGCCTCCGGAGAGAAGAATTC 
CACCCGTGTGTAATATACAGTCTAAGTGTATGGTCTGCTATGTAACACCTGTTGCGCA 
GTGCAAATGCACTGACTCTCTGGAAGGCTATAGAGTTTTAAAAACGGTTAGTCTTTTA 
AAAAAAAAA

TABLE 6 - XSARA1 - Sequence ID NO:6
MPKMVIGDTIMAEDSLFNNTGPSEIVCNPSTVE^ 

FSPCSSPKSIINFDCLTMDNEMPLHSQMSVDDNDKETVTISVLPTIIQDTSNVSTDPA 

INKPGTKEPHRALKETTSVILPEIKPYSTCAALSFEIINI^^ 

EACSEQQQKHTSSLHEEKLFEGVSATESFAATAAETV^^ 

SCVINGDLTKSCGLSQESNEKFCASKEFEGGVDANVLLENACVAYKEAIDLPEENGTN
APMSLYNGCDSYGMKNPAVAQNPKULPSKEDSVTEEKEIEESKSEYYTGVYEQQREDD 
VTERGGLLLNAKADQMKNNLHSLCNQVPS 

HLKLHIPKPLSEMLQSDLIPPNAGCSSKYKNDMLNKSNQGDNLISESLREDSAVRSPV 

TDANGDFPGEYRGPGSLCLAVSPDSPDNDLLAGQFGVPISKPFTTLGEVAPVWVPDSQ

APNCMKCEARFTFTKRRHHCRACGIO^CAACCSLKCKLOYMDKKEA^ 

AQAWENMLSASVQSPNPNNPAEYCSTIPPMQQAQASGALSSPPPTVMVPVGVLKHPGT 

EGSQSKEQRRWFADGILPNGETADSDNANVTTVAGTLTVSHTNNSTSSESENTSGFC 

GSITQVGSAMNLIPEDGLPPILISTGVKGDYAVEERPSQMSVMQQLEEGGPDPLVFVL 

NANLLAMVKIVNYVNRKCWCFTTKGMHAVGQAEIVILLQCLPDEKCLPRDLFSHFVEL 

YQEAIAGNWGNLGHSFLSQSFLGSKDHGGFLYVAPTYQSLQDLVLPAEPYLFGILIQ 

KWETPWAKVFPIRLMLRLGAEYRLYPCPLFSVRYRKPLFGETGHTIINVLADFRNYQY

TLPWQGLWDMEVRKTSIKIPSNRYNEMMKAMNKSNEHVLAIGACFNQMADSHLVCV

QNDDGNYQTQAISIHKQPRKVTGASFF^SGALKSSSGYLAKSSIVEDGVMVQITAES 

MDALRQSLREMKDFTITCGKADAEESQEHVYVQWVEDDKNFNKGVFSPIDGKSMESVT 

SVKIFHGSEYKASGKIIRWIEWFLDNEEQQSGLSDPADHSRLTENVAKAFCLALCPH 

LKLLKEDGMTRLGLRVSLDSDQVGYQAGSNGQLLPARYTNDLDGALVPVIHGGTCQLS 

EGPVSMELIFYILENIS*

TABLE 7 - XSARA2 - Sequence ID NO:7

agttttattttcagaagacgttgcatctttattttaaacattaagtttcactatgtag 
taaaacattactgttgtatatacagtatgttgtagacatataacgtaactgtttgctt 
tgtgctttctttcctcctcagatgaaactgtctttccaaagctgttagatgctaagtg 
gaatcaattcttagaaccacattcgcataaagtcactgataaaccagctcttgacaat
gtctgtaaatcaatcattgctattgaagctcatctcaaagtcaggtcacccagcttga 
cagcccttgcaaggtccacatatgtgaatggagaagtaggtattgtgactcctgaaat 

GCCTAAAATGGTGATAGGAGACACCGATATGGCAGAGGATTCACTTTTTAACACTGGT
CCCTCTGAAATTGTATGCAACTCTATTGTGGAGAGTCAAAGTTTAGAAGTTTTAGATG 
ATGTACCAGTGAGTATTAACAATGAAAAAAGTGTTCTTCTTGATGATGGATTTTCTCC
GTACAGTAGCCCCAAAAGTGTTCTAAACTCTGCTTGCTTGACCATGAATAACGGAAAG 
CCCTCACACGGTCAAAAAATTGTTAATGACCAAGATAAAGAAGCTGTAACAATTTCAG
TCCTTCCAATGATCATACAGGATACTACTAACGTAAGCACAGACCCAGCTTTCAATAA
ATCTGGCACTGAAGAAGCTTATAGTGCATTAAAACAAACCACATCAGTTATTCTGCCT
GAAATAAAGCCTTATTCCATACAGGCTGCCCTTTCATGTGAAAATATCAACAAGATAC 
CCAGATGTCAATTAAATAATACAGATCTACTCAGCATTTCACCAGTGGTTGAAGCATG 
TAGTGAGAAGCAGCAAAATCATACAACTTCCTTGCATGAAAAAAAACTTGCAGCTGTG
TCTGCAACTGCGTTCTTTCCAGTCACTGCTGCTGAAACTGTACTAGGTAATGAAGCTC 
TCCATAGTGCTGATTTTTTTGACATTGTTGTAAAGAACGTTTCTGACTCGTGTGTGTT 
TAATGGTGACCTAACTAGAACTAATGGACTCTCACAAGAAAACAATGAAATGTTTTAT 
GCAAGTAAAGAGTTGGAAGGAGGGGTAGATGCTAATATCTTATTGGAAGATGCATGCA
TAGCTTATAAAGAAAGAATAGATTTGTCTGAAGAAAATGGAACTAATGCACCAATGTA
TCTGTACAATGGGTGTGATTCCTATGGAATGAAAAACCCTGCTGTACGTCAAAACCCA 
AAGAATTTACCATCAAAAGAAGATTCTGTGACAGAAGAAAAAGAAATTGAAGAAAGCA
AGTCAGAATACTATTCTGGTGTTTATGAACAACAGAAGGAAGATGACATAACTGAGAG 
AGGTGGAGTCTTGTTAAATGCCAAGGTTGACCAAATGAAGAACAGTTTGCATAGTCTT 
TATAATCCGGTTCCATCCATGCATGGGCAAACCTCACCAAAAAAGGGCAAGATTGTGC 
AATCCCTCAGTGTTCCATATGGTGGAGCTCGCCCCAAGCAGCCAACTCATCTCAAACT 
CAATATTCCACAGCCATTGTCTGAAATGTTACAGTGTGATCTCATTCCGCCAAATGCT 
GGATGCAGCTCTAAAAACAAAAATGACATGTTAAACAAATCAAATCGGGGGGATAACC
TGATTTCAGAATCACTACGTGAGGAAGTGCACAGCCCTGTTACTGATACAAATGGTGA
AGTCCCTCGAGAAAACAGGGGACCTGGCAGCCTGTGCCTTGCAGTGTCTCCAGACAGC 
CCTGACAATGATCTGCTTGCTGGACAGTTTGGGGTACCCATCTCTAAGCCATTTACTA
CTCTAGGGGATGTGGCTCCAGTCTGGGTGCCAGATTCCCAAGCACCAAACTGCATGAA 
GTGCGAGGCCAGATTTACATTTACCAAAAGGAGGCATCACTGCCGAGCTTGTGGAAAG
GTATGTAAAGAAATGTGGTGTTTCATCAGGGCAACAGTAATCACGGCAAATTATTCAT
AACAAAATGTGTTCAGCAGATTCAGTTAAAGTAGACTTATAAGTTACACAGTAACAAT
TCATCTGCTCAGCCTCATTTTGAAGTAGATAAAATATATTTTATTAGGAAACTCTGGG 
GAGATATAAGGGAAAGCTTGCCTAAAAGTAGATGTTCTGTATATTATTTGGTAGTCAA 
AGATGATTTCATGAAAAAAGGTTATTTGTAAAAAGTACAAAATGGGTAGAGACTAGAC 
AATAAAAAGTAAGGAGTAAAAAACTAGGTATGTAACGTATATTAAAATAATTTTATGA 
TTTTAATATTTACTGCACATTTTCTACAGTGCAGTGATTTGTATAACCATGCAATTAT
CAAATGCTTAGTGCCTTCACACAAAGTGCCTTTAATAAAAATTATTTTATAAATTATC
ATATTTTCTTTATATGTAGTCATCATCTTTTTTGTCTCATTTCTTGGAATCGTTCTAC
TTATGTTCTACTGATATGTTTTTTACCCGAGACCTATCTTGTCCTCTAAAGTAATTGG 
CTTGTCAACTGGCTGTAGGGGGATTTTCAGAGTTATAGCTTAGTACTGTTAATGAGCC 
ATAGGTTGAAATAGTGCTCTAGATTTACATGTTGTACAACAGTTATTGCAATATGTGT 
AGGGGGGGGG

TABLE 8 - XSARA2 - Sequence ID NO:8
MPKMVIGDTDMAEDSLFNTGPSEIVC^ 

PYSSPKSVLNSACLTMNNGKPSHGQKIVNDQDKEAVTISVLPMIIQDTTNVSTDPAFN

KSGTEEAYSALKQTTSVILPEIKPYSIQAALSCENINKIPRCQLNNTDLLSISPVVEA

CSEKQQNHTTSLHEKKLJIAVSATAFFPVTAAETVLGNEALHSADFFD 

FNGDLTRTNGLSQEmiEMFYASKELEGGVDANILLEDACIAYKERIDLSEENGTNAPM 

YLYNGCTSYGMKNPAVRQNPKNLPSKEDSVTEEKEIEESKSEYYSGVYEQQKEDDITE 

RGGVLLNAKVDQMKNSLHSLYNPVPSMHGQTSPKKGKIVQSLSVPYGGARPKQPTHLK 

LNIPQPLSEMLQCDLIPPNAGCSSIQJKNDMLOTCSNRGDNLISESLREEVHSPVTDTNG 

EVPRENRGPGSLCLAVSPDSPDNDLLAGQFGVPISKPFTTLGDVAPVWVPDSQAPNCM 

KCEARFTFTKRRHHCRACGKVCKEMWCFIRATVITANYS 
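
The correspondence between the nucleotide tables and the amino acid tables can be checked mechanically. The following Python sketch, offered only as an illustration, translates the first 28 codons of the hSARA1 open reading frame as read from Table 1 (beginning at the ATG in the eighth sequence line, with OCR spacing removed) using the standard genetic code. The names CODON_TABLE, translate and ORF_START are assumptions introduced for this example and do not appear in the application.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop stray whitespace left by OCR
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Start of the hSARA1 open reading frame, transcribed from Table 1.
ORF_START = (
    "ATGTGGATTGATGAAAATGCTGTTGCAGAAGACCAGTTAATTAAGAGAAAC"
    "TATAGTTGGGATGATCAATGCAGTGCTGTTGAA"
)

print(translate(ORF_START))  # MWIDENAVAEDQLIKRNYSWDDQCSAVE
```

The printed string can be compared with the first residues listed in Table 2. The same function could be applied to the other nucleotide tables once their open reading frames have been located.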






WO 00/05360 



PCT/CA99/00656 




TABLE 9

[Amino acid sequence alignment of hSARA and XSARA. The residue-by-residue alignment is not legible in the source.]

TABLE 10

[Amino acid sequence alignment of hSARA, XSARA, KIAA0305, FGD1, Hrs, Hrs-2 and EEA1. The residue-by-residue alignment is not legible in the source.]
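
Alignments such as those of Tables 9 and 10 compare sequences position by position. As a rough sketch of that kind of comparison, the following Python example computes the percentage of identical residues over one ungapped, cysteine-rich segment that appears in both hSARA1 (Table 2) and hSARA2 (Table 4), transcribed here with OCR spacing removed. The function name percent_identity, the constant names and the choice of segment boundaries are assumptions made for illustration; the application does not describe this calculation.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical residues in two equal-length, ungapped segments."""
    if len(a) != len(b):
        raise ValueError("segments must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Corresponding segments transcribed from Table 2 and Table 4 (38 residues each).
HSARA1_SEGMENT = "APNCMKCEARFTFTKRRHHCRACGKVFCASCCSLKCKL"
HSARA2_SEGMENT = "APNCMNCQVKFTFTKRRHHCRACGKVFCGVCCNRKCKL"

identity = percent_identity(HSARA1_SEGMENT, HSARA2_SEGMENT)
print(f"{identity:.1f}% identity over {len(HSARA1_SEGMENT)} residues")
```

For these two segments, 30 of the 38 positions are identical. A full alignment would also have to handle gaps, which this sketch deliberately ignores.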
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WE CLAIM: 

1. An isolated polynucleotide comprising a nucleotide sequence encoding a
SARA protein or a splice variant thereof. 


2. The isolated polynucleotide of claim 1, wherein the SARA protein is a 
mammalian SARA protein. 

3. The isolated polynucleotide of claim 1, wherein the SARA protein is a 
non-mammalian SARA protein.

4. The isolated polynucleotide of claim 2, wherein the SARA protein is a 
human SARA protein. 

5. The isolated polynucleotide of claim 3, wherein the SARA protein is a
Xenopus SARA protein. 

6. The isolated polynucleotide of claim 1, wherein the nucleotide sequence 
is selected from the group consisting of 

(a) a nucleotide sequence encoding the amino acid sequence of 
Sequence ID NO:2; 

(b) a nucleotide sequence encoding the amino acid sequence of 
Sequence ID NO:4; 

(c) a nucleotide sequence encoding the amino acid sequence of 
Sequence ID NO:6; 

(d) a nucleotide sequence encoding the amino acid sequence of 
Sequence ID NO:8; and 

(e) a nucleotide sequence encoding a SARA protein and capable of 
hybridising to a sequence complementary to the nucleotide sequence of 
any of (a) to (d) under stringent hybridisation conditions. 

7. The isolated polynucleotide of claim 4 comprising the nucleotide 
sequence of Sequence ID NO:1 or a degeneracy equivalent thereof. 

8. The isolated polynucleotide of claim 4 comprising the nucleotide 
sequence of Sequence ID NO:3 or a degeneracy equivalent thereof. 

9. The isolated polynucleotide of claim 3 comprising the nucleotide 
sequence of Sequence ID NO:5 or a degeneracy equivalent thereof. 

10. The isolated polynucleotide of claim 3 comprising the nucleotide 
sequence of Sequence ID NO:7 or a degeneracy equivalent thereof. 

11. An isolated polynucleotide comprising a nucleotide sequence of at least 
10 up to the total number of consecutive nucleotides of a sequence selected 
from the group consisting of Sequence ID NO:1, Sequence ID NO:3, Sequence 
ID NO:5 and Sequence ID NO:7 or a nucleotide sequence complementary to 
any one of said sequences. 

12. An isolated polynucleotide comprising a nucleotide sequence encoding at 
least one functional domain of a SARA protein. 

13. The isolated polynucleotide of any one of the preceding claims wherein 
the polynucleotide is a polydeoxyribonucleotide. 

14. The isolated polynucleotide of any one of claims 1 to 11 wherein the 
polynucleotide is a polyribonucleotide. 

15. An isolated polynucleotide encoding a SARA protein FYVE domain. 

16. A recombinant vector comprising the isolated polynucleotide of any one 
of claims 1 to 15. 

17. A host cell comprising the recombinant vector of claim 16. 

18. A process for recombinantly producing a SARA protein or a fragment 
thereof comprising culturing the host cell of claim 17 under conditions whereby 
the SARA protein or fragment thereof is expressed and isolating the expressed 
SARA protein or fragment thereof. 

19. A substantially pure SARA protein. 

20. The protein of claim 19 which is a mammalian SARA protein. 

21. The protein of claim 19 which is a non-mammalian SARA protein. 

22. The protein of claim 20 which is a human SARA protein. 

23. The protein of claim 22 comprising the amino acid sequence of Sequence 
ID NO:2 or Sequence ID NO:4. 

24. The protein of claim 21 comprising the amino acid sequence of Sequence 
ID NO:6 or Sequence ID NO:8. 

25. A SARA protein that is at least 50 percent identical in amino acid 
sequence to the sequence of Sequence ID NO:2 or Sequence ID NO:4. 

26. The protein of claim 25 wherein the SARA protein has a FYVE domain 
having at least 65 percent identity in amino acid sequence to the FYVE domain 
of hSARA1 (Sequence ID NO:2) and a C-terminal sequence of 550 consecutive 
amino acids which have at least 50 percent identity to the C-terminal 550 amino 
acid residues of hSARA1. 

27. The protein of claim 25 wherein the SARA protein has an FYVE domain 
having at least 65 percent identity in amino acid sequence to the FYVE domain 
of hSARA1 (Sequence ID NO:2) and wherein the portion of the SBD 
corresponding to amino acid residues 721 to 740 of hSARA1 has at least 80 
percent identity with that portion of hSARA1. 

28. A substantially pure polypeptide comprising an amino acid sequence of at 
least 4 up to the total number of consecutive amino acids of a sequence selected 
from the group consisting of Sequence ID NO:2, Sequence ID NO:4, Sequence 
ID NO:6 and Sequence ID NO:8. 

29. A substantially pure polypeptide comprising at least one functional 
domain of a SARA protein. 

30. A substantially pure polypeptide selected from the group consisting of 

(a) SASSQSPNPNNPAEYCSTIPPLQQAQASGALSSPPPTVMVPV 
GVLKHPGAEVAQPREQRRVWFADGILPNGEVADAAKLTMNGTSS; and 

(b) amino acids 589 to 672 of the XSARA1 sequence of Table 9. 

31. A substantially pure polypeptide comprising a SARA protein FYVE 
domain. 



32. The polypeptide of claim 31 comprising a polypeptide selected from the 
group consisting of 

(a) amino acids 587 to 655 of the hSARA1 sequence of Table 9; 

(b) amino acids 510 to 578 of the XSARA1 sequence of Table 9; 

(c) the consensus amino acid sequence of Table 10; and 

(d) a functional fragment of a polypeptide of any of (a) to (c). 

33. A substantially pure polypeptide comprising a SARA protein TGFβ 
receptor interacting domain. 

34. The polypeptide of claim 33 selected from the group consisting of 

(a) amino acids 751 to 1323 of the hSARA1 sequence of Table 9; and 

(b) a functional fragment of the polypeptide of (a). 

35. A substantially pure antibody which selectively binds to an antigenic 
determinant of a SARA protein. 

36. A cell line producing the antibody of claim 35. 

37. A method for identifying an allelic variant or homologue of a human 
SARA gene comprising 

choosing a nucleic acid probe or primer capable of hybridising to a 
human SARA gene sequence under stringent hybridisation conditions; 

mixing the probe or primer with a sample of nucleic acids which may 
contain a nucleic acid corresponding to the variant or homologue; 
and 

detecting hybridisation of the probe or primer to the nucleic acid 
corresponding to the variant or homologue. 

38. A method for modulating signal transduction by a TGFβ superfamily 
member through a SARA protein-dependent pathway, the method comprising 
modulating the binding of the SARA protein with its binding partner. 

39. The method of claim 38 comprising a method selected from the group 
consisting of 

(a) modulating the binding of the SARA protein to a Smad binding 
partner; 

(b) modulating the binding of the SARA protein FYVE domain to its 
binding partner; and 

(c) modulating the binding of the SARA protein to the TGFβ receptor. 

40. A method for preventing or treating a disorder characterised by an 
abnormality in a TGFβ superfamily member signaling pathway which involves a 
SARA protein, the method comprising modulating the binding of the SARA 
protein involved in the pathway with its binding partner. 

41. A method for screening a candidate compound for its potential as a 
modulator of SARA protein-dependent signaling by a TGFβ superfamily member 
comprising 

(a) determining the ability of the compound to bind to a SARA 
protein; and 

(b) determining the ability of the compound to alter the 
phosphorylation state of a SARA protein. 

42. A non-human transgenic animal comprising a polynucleotide encoding a 
heterologous SARA protein or a portion thereof. 

43. The transgenic animal of claim L01 wherein the polynucleotide encodes a 
human SARA protein or a portion thereof. 

44. A non-human animal having a genome from which the SARA gene has 
been deleted. 

gctaaagtat ttcctatccg tctgatgttg 3360 
tgcccactat tcagtgtcag atttcggaag 3420 
atgaatcttc ttgcagactt cagaaattac 3480 
gtggttgata tggaagttcg gaaaactagc 3540 
atgatgaaag ccatgaacaa gtccaatgag 3600 
gaaaaggcag actctcatct tgtgtgtgta 3660 
gctatcagta ttcacaatca gcccagaaaa 3720 
ggcgctctga aatcctcttc tggatacctt 3780 
atggtccaga ttactgcaga gaacatggat 3840 
gacttcacca tcacctgtgg gaaggcggac 3900 
cagtgggtgg atgatgacaa gaacgttagc 3960 
tccatggaga ctataacaaa tgtgaagata 4020 
aaagtaatca gatggacaga ggtgtttttc 4080 
agtgatcctg cagatcacag tagattgact 4140 
ctctgtcctc acctgaaact tctgaaggaa 4200 
acacttgact cagatcaggt tggctatcaa 4260 
cagtacatga atgatctgga tagcgccttg 4320 
cttagtgagg gccccgttgt catggaactc 4380 
acagagaaga cttcattttt ttctgttcag 4440 
atttgcactt taaaactgga agattaagct 4500 
agggtgggag tgggggtttg ggagacgggt 4560 
cataattcta agtcttctat gcattgtcca 4620 
cacaacagtt atgctatcct tgcagctaat 4680 
cgctcctctc tcaagattta cttatggtca 4740 
catcaccaag ggtgggatgg gagggcagag 4800 
aaaaaaaaa 4839 



<210> 2 
<211> 1323 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Met Trp Ile Asp Glu Asn Ala Val Ala Glu Asp Gln Leu Ile Lys Arg 
1 5 10 15 

Asn Tyr Ser Trp Asp Asp Gln Cys Ser Ala Val Glu Val Gly Glu Lys 
20 25 30 

Lys Cys Gly Asn Leu Ala Cys Leu Pro Asp Glu Lys Asn Val Leu Val 
35 40 45 
Val Ala Val Met 
50 

Gin Asp Cys Asn 
65 

Cys Ser Leu Asp 



lie Asn Glu Ser 
100 

Pro Leu Asn Arg 
115 

Pro Thr Ser Ser 
130 

Lys Asp Asp Gly 
145 

Ser Leu Thr Val 



Pro Ala Val Lys 
180 

Gly Lys lie Ser 
195 

Ser His Met Ser 
210 

Ser Thr Thr Glu 
225 

Pro Asp Met Pro 



Ser Asp Cys Leu 
260 

Tyr Glu His Glu 
275 

Glu His Phe Ser 
290 

Leu Asn Glu Met 
305 

Leu Gin lie Ser 



Cys Val Gly Leu 
340 

Ser Glu Ser Glu 
355 

Ala Asn Tyr Leu 



His Asn Cys Asp 
55 

Asn Tyr Asn Ser 
70 

Asn Glu Asn Arg 
85 

Thr Glu Lys Asp 



Pro Lys Thr Glu 
120 

Asp Ser Leu Ala 
135 

Ser lie Gly Arg 
150 

Asp Ser Val lie 
165 

Lys Gin Glu Asn 



Ser Pro Arg Thr 
200 

Glu Gly lie Leu 
215 

Glu Ser Leu Arg 
230 

Asn Gly Ser Gly 
245 

Val Pro Asn Glu 



Glu Thr Leu Gly 
280 

Glu Ser Gin Asp 
295 

Asn Asp Ser Gin 
310 

Gin Pro Glu Asp 

325 

Ala Asp Ala Gly 



Glu Cys Asp Phe 
360 

Ser Asn Gly Cys 



Lys Arg Thr Leu 
60 

Gin Ser Leu Met 
75 

Gin Thr Asp Gin 
90 

Met Asn Ser Glu 
105 

Gly Arg Ser Val 



Ser Val Cys Ser 
140 

Asp Pro Ser Met 
155 

Ser Ser Gin Gly 
170 

Tyr He Pro Asp 
185 

Asp Leu Gly Ser 



Met Lys Lys Glu 
220 

Ser Gly Leu Pro 
235 

Arg Asn Asn Asp 
250 

Val Arg Ala Asp 
265 

Thr Thr Glu Phe 



Met Thr Asn Trp 
300 

Val Asn Glu Glu 
315 

Thr Asn Gly Asp 
330 

Leu Asp Leu Lys 
345 

Ser Thr Val He 



Asp Ser Tyr Gly 



Gin Asn Asp Leu 



Asp Ala Phe Ser 
80 

Phe Ser Phe Ser 
95 

Lys Gin Met Asp 
110 

Asn His Leu Cys 
125 

Pro Ser Gin Leu 



Ser Ala He Thr 
160 

Thr Asp Gly Cys 
175 

Glu Asp Leu Thr 
190 

Pro Asn Ser Phe 
205 

Pro Ala Glu Glu 



Leu Leu Leu Lys 
240 

Cys Glu Arg Cys 
255 

Glu Asn Glu Gly 
270 

Leu Asn Met Thr 
285 

Lys Leu Thr Lys 



Lys Glu Lys Phe 
320 

Ser Gly Gly Gin 
335 

Gly Thr Cys He 
350 

Asp Thr Pro Ala 
365 

Met Gin Asp Pro 
370 



375 



380 



Gly Val Ser Phe 

385 

Thr Glu Glu Lys 



lie Tyr Glu Gin 
420 

Leu Asn Ser Thr 
435 

Cys Ser Gin Val 
450 

Ala Ser Leu Pro 
465 

Gin Pro Ser Asn 



Leu Gin Asn Asp 
500 

Asn Asp lie Leu 
515 

Val Cys Ser Pro 
530 

Glu His Leu Glu 
545 

Ala Leu Ala Pro 



Gly lie Ser Ala 
580 

Val Trp Val Pro 
595 

Arg Phe Thr Phe 
610 

Val Phe Cys Ala 
625 

Asp Arg Lys Glu 



Asn Ala Gin Ala 
660 

Asn Pro Asn Asn 
675 

Gin Ala Gin Ala 
690 



Val Pro Lys Thr 
390 

Glu He Glu Glu 
405 

Arg Gly Asn Glu 



Gly Asp Leu Met 
440 

Pro Ser Val Leu 
455 

Ser He Ser Val 
470 

Leu Lys Leu Gin 
485 

Phe Pro Ala Asn 



Gly Lys Ala Lys 
520 

Ser Leu Gly Asn 
535 

Ser Tyr Glu Ala 
550 

Asp Ser Pro Asp 
565 

Arg Lys Pro Phe 



Asp Ser Gin Ala 
600 

Thr Lys Arg Arg 
615 

Ser Cys Cys Ser 
630 

Ala Arg Val Cys 
645 

Trp Glu Asn Met 



Pro Ala Glu Tyr 
680 

Ser Gly Ala Leu 
695 



Leu Pro Ser Lys 
395 

Ser Lys Ser Glu 
410 

Ala Thr Glu Gly 
425 

Lys Lys Asn Tyr 



Gly Gin Ser Ser 
460 

Pro Phe Gly Gly 
475 

He Pro Lys Pro 
490 

Ser Gly Asn Asn 
505 

Leu Gly Glu Asn 



He Ser Asn Val 
540 

Glu He Ser Thr 
555 

Asn Asp Leu Arg 
570 

Thr Thr Leu Gly 
585 

Pro Asn Cys Met 



His His Cys Arg 
620 

Leu Lys Cys Lys 
635 

Val He Cys His 
650 

Met Ser Ala Ser 
665 

Cys Ser Thr lie 



Ser Ser Pro Pro 
700 



Glu Asp Ser Val 
400 

Cys Tyr Ser Asn 
415 

Ser Gly Leu Leu 
430 

Leu His Asn Phe 
445 

Pro Lys Val Val 



Ala Arg Pro Lys 
480 

Leu Ser Asp His 
495 

Thr Lys Asn Lys 
510 

Ser Ala Thr Asn 
525 

Asp Thr Asn Gly 



Arg Pro Cys Leu 
560 

Ala Gly Gin Phe 
575 

Glu Val Ala Pro 
590 

Lys Cys Glu Ala 
605 

Ala Cys Gly Lys 



Leu Leu Tyr Met 
640 

Ser Val Leu Met 
655 

Ser Gin Ser Pro 
670 

Pro Pro Leu Gin 
685 

Pro Thr Val Met 



Val Pro Val Gly Val Leu Lys His Pro Gly Ala Glu Val Ala Gln Pro 
705 710 715 720 

Arg Glu Gln Arg Arg Val Trp Phe Ala Asp Gly Ile Leu Pro Asn Gly 
725 730 735 

Glu Val Ala Asp Ala Ala Lys Leu Thr Met Asn Gly Thr Ser Ser Ala 
740 745 750 

Gly Thr Leu Ala Val Ser His Asp Pro Val Lys Pro Val Thr Thr Ser 
755 760 765 

Pro Leu Pro Ala Glu Thr Asp Ile Cys Leu Phe Ser Gly Ser Ile Thr 
770 775 780 

Gln Val Gly Ser Pro Val Gly Ser Ala Met Asn Leu Ile Pro Glu Asp 
785 790 795 800 

Gly Leu Pro Pro Ile Leu Ile Ser Thr Gly Val Lys Gly Asp Tyr Ala 
805 810 815 

Val Glu Glu Lys Pro Ser Gln Ile Ser Val Met Gln Gln Leu Glu Asp 
820 825 830 

Gly Gly Pro Asp Pro Leu Val Phe Val Leu Asn Ala Asn Leu Leu Ser 
835 840 845 

Met Val Lys Ile Val Asn Tyr Val Asn Arg Lys Cys Trp Cys Phe Thr 
850 855 860 

Thr Lys Gly Met His Ala Val Gly Gln Ser Glu Ile Val Ile Leu Leu 
865 870 875 880 

Gln Cys Leu Pro Asp Glu Lys Cys Leu Pro Lys Asp Ile Phe Asn His 
885 890 895 

Phe Val Gln Leu Tyr Arg Asp Ala Leu Ala Gly Asn Val Val Ser Asn 
900 905 910 

Leu Gly His Ser Phe Phe Ser Gln Ser Phe Leu Gly Ser Lys Glu His 
915 920 925 

Gly Gly Phe Leu Tyr Val Thr Ser Thr Tyr Gln Ser Leu Gln Asp Leu 
930 935 940 

Val Leu Pro Thr Pro Pro Tyr Leu Phe Gly Ile Leu Ile Gln Lys Trp 
945 950 955 960 

Glu Thr Pro Trp Ala Lys Val Phe Pro Ile Arg Leu Met Leu Arg Leu 
965 970 975 

Gly Ala Glu Tyr Arg Leu Tyr Pro Cys Pro Leu Phe Ser Val Arg Phe 
980 985 990 



Arg Lys Pro Leu Phe Gly Glu Thr Gly His Thr Ile Met Asn Leu Leu 
995 1000 1005 

Ala Asp Phe Arg Asn Tyr Gln Tyr Thr Leu Pro Val Val Gln Gly Leu 
1010 1015 1020 

Val Val Asp Met Glu Val Arg Lys Thr Ser Ile Lys Ile Pro Ser Asn 
1025 1030 1035 1040 

Arg Tyr Asn Glu Met Met Lys Ala Met Asn Lys Ser Asn Glu His Val 
1045 1050 1055 

Leu Ala Gly Gly Ala Cys Phe Asn Glu Lys Ala Asp Ser His Leu Val 
1060 1065 1070 

Cys Val Gln Asn Asp Asp Gly Asn Tyr Gln Thr Gln Ala Ile Ser Ile 
1075 1080 1085 

His Asn Gln Pro Arg Lys Val Thr Gly Ala Ser Phe Phe Val Phe Ser 
1090 1095 1100 

Gly Ala Leu Lys Ser Ser Ser Gly Tyr Leu Ala Lys Ser Ser Ile Val 
1105 1110 1115 1120 

Glu Asp Gly Val Met Val Gln Ile Thr Ala Glu Asn Met Asp Ser Leu 
1125 1130 1135 

Arg Gln Ala Leu Arg Glu Met Lys Asp Phe Thr Ile Thr Cys Gly Lys 
1140 1145 1150 

Ala Asp Ala Glu Glu Pro Gln Glu His Ile His Ile Gln Trp Val Asp 
1155 1160 1165 

Asp Asp Lys Asn Val Ser Lys Gly Val Val Ser Pro Ile Asp Gly Lys 
1170 1175 1180 

Ser Met Glu Thr Ile Thr Asn Val Lys Ile Phe His Gly Ser Glu Tyr 
1185 1190 1195 1200 

Lys Ala Asn Gly Lys Val Ile Arg Trp Thr Glu Val Phe Phe Leu Glu 
1205 1210 1215 

Asn Asp Asp Gln His Asn Cys Leu Ser Asp Pro Ala Asp His Ser Arg 
1220 1225 1230 

Leu Thr Glu His Val Ala Lys Ala Phe Cys Leu Ala Leu Cys Thr Gln 
1235 1240 1245 

Leu Lys Leu Leu Lys Gly Asp Gly Met Thr Lys Leu Gly Leu Arg Val 
1250 1255 1260 

Thr Leu Asp Ser Asp Gln Val Gly Tyr Gln Ala Gly Ser Asn Gly Gln 
1265 1270 1275 1280 

His Leu Pro Ser Gln Tyr Met Asn Asp Phe Asp Ser Asp Leu Val Lys 
1285 1290 1295 

Met Ile His Gly Gly Ala Cys Gln Leu Ser Glu Gly Pro Val Val Met 
1300 1305 1310 

Glu Leu Ile Phe Tyr Ile Leu Glu Asn Ile Val 
1315 1320 



<210> 3 
<211> 6632 
<212> DNA 



<213> Homo sapiens 



<400> 3 

act cccggcc 

ctccgcaggg 

aaat ctttat 

ttggatatct 

caggtaggat 

attt tgaaca 

ctaaccactg 

aagaccaaga 

gttccctgaa 

caggacttga 

atatgggacg 

atgcaaccaa 

cagattcctt 

cagaccatga 

atagagaaat 

cctataatta 

caattgttga 

ccaaagacaa 

taaaagagga 

agacaagtgc 

attcaagaga 

taaaacaat c 

aagattcct c 

ttcctgcgtc 

tacctcagca 

agaacagtgt 

aatgtaaaag 

ctgaccagac 

tagaatctca 

gttttattaa 

ttgatgaagg 

cagaacagta 

aatcgcaaat 

atttcaatgc 

cagttgataa 

cagttcaaca 

atattaacag 

gaacaaggag 

ccagcacagc 

aggttagctt 

tcgtaactgc 

gccagaaaca 

tcaaatttac 

gtgtctgttg 

tagtctgcta 

gttctaatct 

accaaacatc 

gtgttgaagg 

ccaatggtga 

actttagtcc 

ctactacagt 

agagtcctat 

ggttacctac 

catctagtcc 

ataggttact 

atgaggacag 

tagaagaaca 

ctgttacatt 



ggggtagctc 
gctgtaggga 
ctatctcaga 
ctcccaggat 
ggacagttat 
gaacccagat 
ctcagtttct 
gtgcgttaat 
tgaaaaaaca 
tcttctttct 
atgtagtaaa 
tagtgaagaa 
gattggattg 
tagtgatact 
cggaggaatc 
cagtggaaca 
ttttaacatg 
gctacaacac 
agtagatgtg 
tttgacctgc 
tgaaaatttc 
tgcacaagaa 
ttcagcttta 
tgggtctatg 
tgaacataaa 
tgttctaggt 
catactcctt 
agtaatcaga 
agaggggctt 
tactttttca 
cgcaaaaagt 
tcttcagacc 
gaatcagata 
agaagcagga 
acaaaataca 
agggttacct 
tcaatctgtt 
ttcaaaggac 
agataccgtt 
caactctaat 
aaatgaagat 
gcctacttgg 
ttttaccaaa 
taataggaag 
tgaaactatt 
taagtctaat 
cagtatacct 
actatgttcc 
agttgcagat 
tctctcacct 
ggaaaagcca 
ttctcaggtt 
ttctggttca 
tactggtgtc 
gtgtgatatt 
tttgccccca 
tccatctcat 
tgtcctaaat 



ttcactcctc 
ggtgatctca 
cttctctcct 
gttctcaagg 
tttaaagcag 
gaacaagatt 
tcagagttgg 
agttgtgcct 
ctcaagggac 
tctgtggatg 
cctatctgtg 
gatattaaaa 
gatttatctt 
gtcagagaac 
aaagaattgg 
gaaaatttaa 
tcatctgctt 
aagagccagc 
gcagtcataa 
agccttccga 
aaattacctg 
gactcaaaaa 
catgtttcca 
tgtggatcat 
gataatatac 
ggggaaccat 
cagtcattaa 
gctgagt ctt 
tctggcactc 
agcaatgata 
ggcccactaa 
actaacataa 
gatatgaaag 
gctattgggg 
atagaaaatg 
accagtaagt 
ggaggggcca 
ctgaataagc 
gttccaatca 
tacattgata 
tctgtacctg 
gttcctgatt 
cggcgacacc 
tgtaaactgc 
agtaaagctc 
cattctgatg 
tcaccagcaa 
aaagaacaga 
acaacaaaat 
gatgtgccta 
aacaatgaga 
ccatcagtgg 
tttacactag 
ttagttaaca 
aacaagtatg 
cttctggttg 
gagcagatca 
gctaatctac 



agcgcgacgt 
tccattaaca 
gcattccaga 
catacaagaa 
ctgtcagtga 
atctcgcaga 
cttcctcaca 
catcagaaac 
ttacttctat 
gtggtacttc 
atctgataag 
aattattgcc 
cagtgtcaga 
aacagaatga 
gtataaaagt 
aagataaaaa 
tgactcgaca 
catgtggatt 
ctgccgcaga 
aaaatgaaga 
acttttcctt 
gtttagacct 
gtaaagatgt 
taattgaaag 
aagatgcagt 
tcaaagagaa 
ttgaagggat 
tggatggtgg 
atgtcccaga 
tggatgggca 
ttagtgatgc 
agtcttttga 
gcttagatga 
aaagtcatgg 
gcctttcttt 
ctgagattac 
gacctaagca 
cagatgttcc 
cttgtgctat 
tagaaagtaa 
aaaacacttg 
cagaagctcc 
attgccgagc 
aatatctaga 
aggcatttga 
aatgtactac 
ctttgccagt 
agagagtatg 
tatcatctgg 
tgacagtaaa 
caggagatat 
aaaaattgtc 
atgatgatgt 
gcaatttacc 
t ctgcaataa 
catctggaga 
ttttgcttct 
tcgtgaatgt 



cgtgtcgagt 
gctgtgtgtt 
ttcttatat t 
ttaaattctg 
cttggacaaa 
tgtacaaaat 
gcgaacttca 
aagctatgga 
acaaaatgaa 
agatgaaatc 
tgacatgggt 
agatgatttt 
tactccctgt 
tatcagttct 
agatacaaca 
gatctttaat 
aagttccaaa 
actaaaagat 
atgtttaaaa 
tttatgctta 
tcaggaagat 
taaggataat 
gccgtcct ca 
taaagcacgg 
gactatacat 
tgatcttttg 
ggaagacaga 
tgacaccagt 
gtcttctgat 
agacttagat 
tgaacttgat 
agaaaatgta 
tggaaacatc 
tattaatata 
aggagaaaaa 
aaatcaatta 
attgtttagc 
agatacaata 
agattctaca 
ttctgaaggt 
caaagaaggc 
aaactgtatg 
atgtgggaaa 
aaaggaagca 
aaggatgatg 
tgtccagcct 
ctcagcactt 
gtttgcagat 
aagtaaaaga 
cacagtggat 
tacaagaaat 
tatgaacaca 
ttttgcagaa 
tattgctagt 
gattagtctt 
aaagggatca 
tgaaggtgaa 
caaattcata 



t cccaaaaag 
gccagttccc 
cagctgcctt 
aataagtctg 
ctccttgatg 
gcatatgatt 
t tgctcccaa 
acaaatgaga 
aaaaatgtaa 
cagccgttat 
aacttagttc 
aagtctaatg 
gtttcttcaa 
gaattacaaa 
cttt cagatt 
cagttagaat 
atgtttcatg 
gttggcttag 
gaagagggca 
aatgatt caa 
aagactgtta 
gatgtaatcc 
ttgtcctgtc 
ggtgattttt 
gaagaaatac 
aaacaggaaa 
aagatagatc 
tctacagttg 
tgttgtgaag 
tactttaata 
gcctttctga 
aatgactcta 
aataatatat 
atttgtgaaa 
agcactattc 
tcagtctctg 
cttccatcaa 
gaaagtgaac 
gctgatccac 
ggatctagtt 
ttggttttgg 
aactgccaag 
gtattttgtg 
agagtatgtg 
agtccaactg 
cctcaggaga 
aaacaaccag 
ggtatattgc 
tgttctgaag 
cattcccatt 
gagataattc 
ggaaatgagg 
actgaagaac 
atttcagatt 
ctacctaatg 
gtgcctgtag 
ggctttcatc 
ttttattcct 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 
cagacaaata 

ttattctatt 

ttatcaccat 

cct ttactga 

ctt ttcagaa 

tccagaagct 

gtgcagaata 

ttggagaaat 

ccttgcataa 

taccacggaa 

ttagcattgg 

atggaattta 

caagttttgt 

gcatagttga 

tagctttacg 

tgagagaata 

tcagttcagt 

cagattttga 

accaggattt 

gtagtgctgc 

gact cagagt 

ttct gcctca 

ggacctccaa 

ttttttagtg 

cagcactaaa 

acataagctt 

tatctaaaaa 

aagacaagga 

aaagtat ctg 

tcagatgtaa 

ttctaaagtc 

tataattctt 

ataattagat 

attacatatg 

aaataattgc 

caagaacata 

gtatcacaat 

acttctaaga 

aaatgaagcc 

cttcgcatat 

tgcatatttt 

aaatcctatg 

tggggggcat 

tgctttaaac 

agaaaaagaa 

cttgaagatc 

gaattttaat 

agacaatttg 

tgtcgcagat 

gtaagatgaa 

ttctgccatg 

gatgggacag 

tgtgcctgaa 



ttggtacttt 
gttatgtttg 
atataaggat 
gagttttctc 
acttgatgat 
tgagattccc 
taaagcatat 
aggacacact 
tatagatcaa 
aaagtacagt 
agcaagtttc 
tgaaacacag 
ggtattcaat 
agatggctta 
agaacagaaa 
cgtggatatc 
ggatggaata 
aaccgatgag 
atctatttta 
gctgtgccct 
ttccattgac 
gcattatcta 
ctctagttta 
aaagaatgtg 
gctgaaatgc 
tgctctttag 
tttaagagat 
tttcttttaa 
cacttaggta 
ccttatgaag 
tgtcaaattg 
tgaataaaac 
aaactctttt 
tatgtattat 
gggtgcttca 
ttctgaatgt 
gcattatttt 
gtgactgacg 
tgaaacaggt 
tttcattgac 
aaattcttta 
tatttgaaat 
attgtagtcc 
attctcatgt 
attatgttaa 
aagtcaaagt 
cagtttgggc 
gaggagaata 
attcaataaa 
aacaattgca 
taagtaattg 
ctttggattt 
gtttggaggc 



tcaaccaatg 
ccaaatgaag 
get ctaaaag 
agtagcaagg 
ctctcattac 
tgggcaaagg 
cctgctcctc 
attatgaact 
ctgttgattc 
gatgtaatga 
agtacagaag 
gecaacagtg 
ggagctctaa 
atggtacaaa 
gactttaaaa 
tgctgggtag 
tcattacaag 
aagattgtaa 
tcaacttctt 
cacctgaaaa 
actgatatgg 
aatgatcttg 
ccattagaaa 
ccatattaca 
cacaaacact 
gcaggaatga 
ccatactttc 
gaaatttata 
tacctcttta 
gaaatatctg 
tatttcagtg 
tgataactta 
tggattataa 
atatccacat 
ggactttttg 
tttataaatc 
ttttcctcct 
aegggecaga 
ttttttactt 
actggtgtat 
tatggtagtt 
tgttacagag 
tgtcatttaa 
gtaatatatg 
atagecctgt 
tataactcag 
atatagtttg 
tcagccttct 
atggcaacct 
tatcaaaccc 
aacagtctta 
gttttcataa 
acattttgaa 



gattgeatgg 
atactattcc 
gaaaatacat 
ateaeggagg 
caagtaatcc 
tttttcctat 
taacaagcat 
tacttgttga 
atatggaaat 
aagtactaaa 
cagattctca 
ccactggcca 
aaacatcttc 
taactccaga 
ttacatgtgg 
atgetgaaga 
gatttccaag 
aatgtaccga 
atcagtttgc 
ctctaaaaag 
ttgaatttca 
atagtgetet 
tagaattagt 
tattgeaace 
aaaagtataa 
tcttttcaaa 
tgtagcttta 
gcatttactg 
tgccaataat 
ctttgtgtat 
gcacaaaaac 
tttgtataat 
tcagaatttt 
atatagtttt 
cttctatatt 
tttaataatt 
ttcctt ccaa 
tgacccttga 
ccactttaat 
aagtataaat 
attttttata 
ctttcctctt 
gttatgtaaa 
tttttgtatc 
tttaagaaaa 
gatctgaggt 
gactgaatca 
ggaagtagct 
gttataattt 
aatttatgtt 
aaataaccaa 
aatctctaca 

gt 



cttgggacag 
taaggacat c 
agaaaacttg 
attcctgttt 
ttttctttgt 
gcgtttaatg 
cagaggcega 
ccttcgaaat 
gggaaaaagc 
ttcttccaat 
tctagtctgt 
tcctagaaaa 
aggatttctt 
gaccatgaat 
gaaagttgat 
aaaaggaaac 
tgaaaaaata 
ggtgttctac 
aaaagaaata 
taatgggatg 
ggcaggatct 
gatacctgtg 
gtttttcatt 
taatttgtta 
atatgtctga 
tcattagcac 
caattaattt 
tgttatttaa 
gattttaatg 
atgccagtta 
cagttttgag 
tggagtggag 
gectttttte 
ccctgattaa 
taagtatatt 
tatatgtagg 
actataccac 
agtagtcatt 
ccttagaaat 
ttaaatgaac 
acaggatatt 
tacttcaaac 
aaatttaatc 
aaaaacactc 
atatttatga 
ctcaagctag 
catctgtagt 
acttcctgaa 
gtgaaattta 
ttctaaatat 
atggtagagg 
ttcaataaaa 



gcagaaatta 
ttcagactat 
gacaatatta 
attacaccta 
ggaattctta 
ttgagattgg 
aaacctcttt 
taccagtata 
tgcataaaaa 
gagcat gt ca 
atacagaatg 
gtgacaggtg 
gctaagtcca 
ggcttgegge 
gcagtagacc 
aaaggagtta 
aaactggaag 
tttctaaagg 
gecatggett 
aataaaattg 
gaaggecaac 
atccatggtg 
atagaacatc 
aaactaactc 
tttttgaaac 
aatatttaaa 
aagtactaaa 
atgetaagee 
aaggctcttt 
gaatactggt 
gtcttagact 
acctacctcc 
ttctcaaatt 
atggatatta 
gtttttatag 
taatattttt 
tgtatttacc 
atgtagcaat 
ttcttggcaa 
taattacttt 
aacataagtt 
agcaaaaaag 
attattttga 
atatatttca 
agcatctcaa 
gagagactga 
acttagecaa 
caatgtaaag 
ttgaaatggt 
agtgtatgta 
gctgttccat 
attggaatta 



3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6632 



<210> 4 
<211> 1539 
<212> PRT 

<213> Homo sapiens 



<400> 4 



Met Asp Ser Tyr 
1 

Asp Asp Phe Glu 
20 

Gin Asn Ala Tyr 
35 

Ser Ser Gin Arg 
50 

Ser Cys Ala Ser 
65 

Asn Glu Lys Thr 



Val Thr Gly Leu 
100 

Glu lie Gin Pro 
115 

Leu lie Ser Asp 
130 

Asp lie Lys Lys 
145 

Leu lie Gly Leu 



Ser Thr Asp His 
180 

Ser Ser Glu Leu 
195 

lie Lys Val Asp 
210 

Glu Asn Leu Lys 
225 

Asp Phe Asn Met 



His Ala Lys Asp 
260 

Lys Asp Val Gly 
275 

Ala Ala Glu Cys 
290 

Ser Leu Pro Lys 
305 

Asp Glu Asn Phe 



Phe Lys Ala Ala 

5 

Gin Asn Pro Asp 



Asp Ser Asn His 
40 

Thr Ser Leu Leu 
55 

Ser Glu Thr Ser 
70 

Leu Lys Gly Leu 
85 

Asp Leu Leu Ser 



Leu Tyr Met Gly 
120 

Met Gly Asn Leu 
135 

Leu Leu Pro Asp 
150 

Asp Leu Ser Ser 
165 

Asp Ser Asp Thr 



Gin Asn Arg Glu 
200 

Thr Thr Leu Ser 
215 

Asp Lys Lys lie 

230 

Ser Ser Ala Leu 
245 

Lys Leu Gin His 



Leu Val Lys Glu 
280 

Leu Lys Glu Glu 
295 

Asn Glu Asp Leu 
310 

Lys Leu Pro Asp 



Val Ser Asp Leu 
10 

Glu Gin Asp Tyr 
25 

Cys Ser Val Ser 



Pro Lys Asp Gin 
60 

Tyr Gly Thr Asn 
75 

Thr Ser He Gin 
90 

Ser Val Asp Gly 
105 

Arg Cys Ser Lys 



Val His Ala Thr 
140 

Asp Phe Lys Ser 
155 

Val Ser Asp Thr 
170 

Val Arg Glu Gin 
185 

He Gly Gly He 



Asp Ser Tyr Asn 
220 

Phe Asn Gin Leu 

235 

Thr Arg Gin Ser 
250 

Lys Ser Gin Pro 
265 

Glu Val Asp Val 



Gly Lys Thr Ser 
300 

Cys Leu Asn Asp 
315 

Phe Ser Phe Gin 



Asp Lys Leu Leu 
15 

Leu Gin Asp Val 
30 

Ser Glu Leu Ala 
45 

Glu Cys Val Asn 



Glu Ser Ser Leu 
80 

Asn Glu Lys Asn 
95 

Gly Thr Ser Asp 
110 

Pro He Cys Asp 
125 

Asn Ser Glu Glu 



Asn Ala Asp Ser 
160 

Pro Cys Val Ser 
175 

Gin Asn Asp Thr 
190 

Lys Glu Leu Gly 

205 

Tyr Ser Gly Thr 



Glu Ser He Val 
240 

Ser Lys Met Phe 
255 

Cys Gly Leu Leu 
270 

Ala Val He Thr 

285 

Ala Leu Thr Cys 



Ser Asn Ser Arg 
320 

Glu Asp Lys Thr 
325 



330 



335 



Val lie Lys Gin 
340 

Asp Asn Asp Val 
355 

Lys Asp Val Pro 
370 

Cys Gly Ser Leu 
385 

His Glu His Lys 



lie Gin Asn Ser 
420 

Leu Leu Lys Gin 
435 

Glu Gly Met Glu 
450 

Ala Glu Ser Leu 
465 

Gin Glu Gly Leu 



Glu Gly Phe lie 
500 

Leu Asp Tyr Phe 
515 

Ser Asp Ala Glu 
530 

Thr Asn lie Lys 
545 

Met Asn Gin lie 



lie Tyr Phe Asn 
580 

Asn lie lie Cys 
595 

Leu Ser Leu Gly 
610 

Thr Ser Lys Ser 
625 

Ser Gin Ser Val 



Ser Ala Gin Glu 



He Gin Asp Ser 
360 

Ser Ser Leu Ser 
375 

He Glu Ser Lys 
390 

Asp Asn He Gin 
405 

Val Val Leu Gly 



Glu Lys Cys Lys 
440 

Asp Arg Lys He 
455 

Asp Gly Gly Asp 
470 

Ser Gly Thr His 
485 

Asn Thr Phe Ser 



Asn He Asp Glu 
520 

Leu Asp Ala Phe 
535 

Ser Phe Glu Glu 
550 

Asp Met Lys Gly 
565 

Ala Glu Ala Gly 



Glu Thr Val Asp 
600 

Glu Lys Ser Thr 
615 

Glu He Thr Asn 
630 

Gly Gly Ala Arg 
645 



Asp Ser Lys Ser 
345 

Ser Ser Ala Leu 



Cys Leu Pro Ala 
380 

Ala Arg Gly Asp 
395 

Asp Ala Val Thr 
410 

Gly Glu Pro Phe 
425 

Ser He Leu Leu 



Asp Pro Asp Gin 
460 

Thr Ser Ser Thr 
475 

Val Pro Glu Ser 
490 

Ser Asn Asp Met 
505 

Gly Ala Lys Ser 



Leu Thr Glu Gin 
540 

Asn Val Asn Asp 
555 

Leu Asp Asp Gly 
570 

Ala He Gly Glu 
585 

Lys Gin Asn Thr 



He Pro Val Gin 
620 

Gin Leu Ser Val 
635 

Pro Lys Gin Leu 
650 



Leu Asp Leu Lys 
350 

His Val Ser Ser 
365 

Ser Gly Ser Met 



Phe Leu Pro Gin 
400 

He His Glu Glu 
415 

Lys Glu Asn Asp 
430 

Gin Ser Leu He 
445 

Thr Val He Arg 



Val Val Glu Ser 
480 

Ser Asp Cys Cys 
495 

Asp Gly Gin Asp 
510 

Gly Pro Leu He 
525 

Tyr Leu Gin Thr 



Ser Lys Ser Gin 
560 

Asn He Asn Asn 
575 

Ser His Gly He 
590 

He Glu Asn Gly 
605 

Gin Gly Leu Pro 



Ser Asp He Asn 
640 

Phe Ser Leu Pro 
655 
Ser Arg Thr Arg 
660 

Thr lie Glu Ser 
" 675 

Cys Ala lie Asp 
690 

Tyr lie Asp lie 
705 

Ala Asn Glu Asp 



Leu Gly Gin Lys 
740 

Cys Met Asn Cys 
755 

Cys Arg Ala Cys 
770 

Cys Lys Leu Gin 
785 

Tyr Glu Thr lie 



Thr Gly Ser Asn 
820 

Gin Pro Pro Gin 
835 

Leu Pro Val Ser 
850 

Lys Glu Gin Lys 
865 

Glu Val Ala Asp 



Glu Asp Phe Ser 
900 

Val Asp His Ser 
915 

Gly Asp lie Thr 
930 

Pro Ser Val Glu 
945 

Thr Ser Gly Ser 



Ser Ser Lys Asp 



Glu Pro Ser Thr 
680 

Ser Thr Ala Asp 
695 

Glu Ser Asn Ser 
710 

Ser Val Pro Glu 
725 

Gin Pro Thr Trp 



Gin Val Lys Phe 
760 

Gly Lys Val Phe 
775 

Tyr Leu Glu Lys 
790 

Ser Lys Ala Gin 
805 

Leu Lys Ser Asn 



Glu Asn Gin Thr 
840 

Ala Leu Lys Gin 
855 

Arg Val Trp Phe 
870 

Thr Thr Lys Leu 
885 

Pro Leu Ser Pro 



His Ser Thr Thr 
920 

Arg Asn Glu lie 
935 

Lys Leu Ser Met 
950 

Phe Thr Leu Asp 
965 



Leu Asn Lys Pro 
665 

Ala Asp Thr Val 



Pro Gin Val Ser 
700 

Glu Gly Gly Ser 
715 

Asn Thr Cys Lys 
730 

Val Pro Asp Ser 
745 

Thr Phe Thr Lys 



Cys Gly Val Cys 
780 

Glu Ala Arg Val 
795 

Ala Phe Glu Arg 
810 

His Ser Asp Glu 
825 

Ser Ser lie Pro 



Pro Gly Val Glu 
860 

Ala Asp Gly lie 
875 

Ser Ser Gly Ser 
890 

Asp Val Pro Met 
905 

Val Glu Lys Pro 



lie Gin Ser Pro 
940 

Asn Thr Gly Asn 
955 

Asp Asp Val Phe 
970 



Asp Val Pro Asp 
670 

Val Pro lie Thr 
685 

Phe Asn Ser Asn 



Ser Phe Val Thr 
720 

Glu Gly Leu Val 
735 

Glu Ala Pro Asn 
750 

Arg Arg His His 
765 

Cys Asn Arg Lys 



Cys Val Val Cys 
800 

Met Met Ser Pro 
815 

Cys Thr Thr Val 
830 

Ser Pro Ala Thr 
845 

Gly Leu Cys Ser 



Leu Pro Asn Gly 
880 

Lys Arg Cys Ser 
895 

Thr Val Asn Thr 
910 

Asn Asn Glu Thr 
925 

He Ser Gin Val 



Glu Gly Leu Pro 
960 

Ala Glu Thr Glu 

975 



Glu Pro Ser Ser Pro Thr Gly Val Leu Val Asn Ser Asn Leu Pro Ile 
980 985 990 

Ala Ser Ile Ser Asp Tyr Arg Leu Leu Cys Asp Ile Asn Lys Tyr Val 
995 1000 1005 

Cys Asn Lys Ile Ser Leu Leu Pro Asn Asp Glu Asp Ser Leu Pro Pro 
1010 1015 1020 

Leu Leu Val Ala Ser Gly Glu Lys Gly Ser Val Pro Val Val Glu Glu 
1025 1030 1035 1040 

His Pro Ser His Glu Gln Ile Ile Leu Leu Leu Glu Gly Glu Gly Phe 
1045 1050 1055 

His Pro Val Thr Phe Val Leu Asn Ala Asn Leu Leu Val Asn Val Lys 
1060 1065 1070 

Phe Ile Phe Tyr Ser Ser Asp Lys Tyr Trp Tyr Phe Ser Thr Asn Gly 
1075 1080 1085 

Leu His Gly Leu Gly Gln Ala Glu Ile Ile Ile Leu Leu Leu Cys Leu 
1090 1095 1100 

Pro Asn Glu Asp Thr Ile Pro Lys Asp Ile Phe Arg Leu Phe Ile Thr 
1105 1110 1115 1120 

Ile Tyr Lys Asp Ala Leu Lys Gly Lys Tyr Ile Glu Asn Leu Asp Asn 
1125 1130 1135 

Ile Thr Phe Thr Glu Ser Phe Leu Ser Ser Lys Asp His Gly Gly Phe 
1140 1145 1150 

Leu Phe Ile Thr Pro Thr Phe Gln Lys Leu Asp Asp Leu Ser Leu Pro 
1155 1160 1165 

Ser Asn Pro Phe Leu Cys Gly Ile Leu Ile Gln Lys Leu Glu Ile Pro 
1170 1175 1180 

Trp Ala Lys Val Phe Pro Met Arg Leu Met Leu Arg Leu Gly Ala Glu 
1185 1190 1195 1200 

Tyr Lys Ala Tyr Pro Ala Pro Leu Thr Ser Ile Arg Gly Arg Lys Pro 
1205 1210 1215 

Leu Phe Gly Glu Ile Gly His Thr Ile Met Asn Leu Leu Val Asp Leu 
1220 1225 1230 

Arg Asn Tyr Gln Tyr Thr Leu His Asn Ile Asp Gln Leu Leu Ile His 
1235 1240 1245 

Met Glu Met Gly Lys Ser Cys Ile Lys Ile Pro Arg Lys Lys Tyr Ser 
1250 1255 1260 

Asp Val Met Lys Val Leu Asn Ser Ser Asn Glu His Val Ile Ser Ile 
1265 1270 1275 1280 

Gly Ala Ser Phe Ser Thr Glu Ala Asp Ser His Leu Val Cys Ile Gln 
1285 1290 1295 

Asn Asp Gly Ile Tyr Glu Thr Gln Ala Asn Ser Ala Thr Gly His Pro 
1300 1305 1310 

Arg Lys Val Thr Gly Ala Ser Phe Val Val Phe Asn Gly Ala Leu Lys 
1315 1320 1325 

Thr Ser Ser Gly Phe Leu Ala Lys Ser Ser Ile Val Glu Asp Gly Leu 
1330 1335 1340 

Met Val Gln Ile Thr Pro Glu Thr Met Asn Gly Leu Arg Leu Ala Leu 
1345 1350 1355 1360 

Arg Glu Gln Lys Asp Phe Lys Ile Thr Cys Gly Lys Val Asp Ala Val 
1365 1370 1375 

Asp Leu Arg Glu Tyr Val Asp Ile Cys Trp Val Asp Ala Glu Glu Lys 
1380 1385 1390 

Gly Asn Lys Gly Val Ile Ser Ser Val Asp Gly Ile Ser Leu Gln Gly 
1395 1400 1405 

Phe Pro Ser Glu Lys Ile Lys Leu Glu Ala Asp Phe Glu Thr Asp Glu 
1410 1415 1420 

Lys Ile Val Lys Cys Thr Glu Val Phe Tyr Phe Leu Lys Asp Gln Asp 
1425 1430 1435 1440 

Leu Ser Ile Leu Ser Thr Ser Tyr Gln Phe Ala Lys Glu Ile Ala Met 
1445 1450 1455 

Ala Cys Ser Ala Ala Leu Cys Pro His Leu Lys Thr Leu Lys Ser Asn 
1460 1465 1470 

Gly Met Asn Lys Ile Gly Leu Arg Val Ser Ile Asp Thr Asp Met Val 
1475 1480 1485 

Glu Phe Gln Ala Gly Ser Glu Gly Gln Leu Leu Pro Gln His Tyr Leu 
1490 1495 1500 

Asn Asp Leu Asp Ser Ala Leu Ile Pro Val Ile His Gly Gly Thr Ser 
1505 1510 1515 1520 

Asn Ser Ser Leu Pro Leu Glu Ile Glu Leu Val Phe Phe Ile Ile Glu 
1525 1530 1535 

His Leu Phe 



<210> 5 
<211> 4823 
<212> DNA 
<213> Xenopus 

<400> 5 

ctgtaagttt gactatgtag gaaagcattt ctgttatcta tgaagtatgt tttagagtca 60 

gaccaataac taaacggttt tctttttttt gtttatttcc cctcagatga gactgtctct 120 

ccaaagctat tagatgctaa gtggaatcaa atcttagaac cgcattcaca taaagtcgct 180 

gataactccg cccttgacaa tgtctgtaaa tcaatcattg ctattgaagc tcatctcaaa 240 

gtcaggtcac ccggcttgtc agcccttgtg aggtccacat atgtgaatgg agaagtaggt 300 

attgtggcac ctgaaatgcc caaaatggtg ataggagaca ccattatggc agaggattca 360 

ctttttaaca acactggtcc ctctgaaatt gtatgcaacc catctactgt ggagagtcaa 420 
agtttacaag ctttagatga tcaatcagtg aatattcaca atgaaaaaag tgttctgctc 480 
gctgatggct tttcaccatg cagtagcccc aaaagtatta taaactttga ctgcttgacc 540 
atggataacg aaatgccttt gcacagtcaa atgagtgttg atgacaatga caaagaaact 600 
gtaacaattt cagtccttcc aacaatcata caggatacta gtaacgtaag cacagaccca 660 
gctatcaata aacctggcac taaagaaccc catagagcat taaaggaaac cacatcagtt 720 
attctgcctg aaataaagcc ttactccaca tgtgctgccc tttcgtttga aaataacaat 780 
aaggttccca gttatcaatt aaataataca gatctactca gcgtttcacc agtggttgaa 840 
gcatgtagtg agcagcagca aaaacataca tcttccttgc atgaagaaaa actttttgaa 900 
ggtgtttctg caacggagtc ctttgcagcc actgctgcgg aaactgtact ggataatgag 960 
gctctccgta gtgctgaatt ctttgacatt gttgtaaaga acttttctga ctcttgtgtg 1020 
attaatggcg acttgactaa aagttgtggc ctctctcaag aaagcaatga aaagttttgt 1080 
gcaagtaaag agtttgaagg aggggtagat gctaatgtct tgttggaaaa tgcatgtgta 1140 
gcttataaag aagcaataga tttgcctgaa gaaaatggaa ctaatgcacc aatgtctctg 1200 
tacaatgggt gtgattccta tggaatgaaa aacccagccg tagctcaaaa cccaaagaat 1260 
ttaccttcaa aagaagattc tgtgacagaa gaaaaagaaa ttgaagaaag caagtcagaa 1320 
tactatactg gtgtttatga acaacaaaga gaagatgatg ttacagagag aggtggactt 1380 
ctgttaaatg ctaaggctga ccaaatgaag aacaatttgc atagtctttg taatcaggtt 1440 
ccatccatgc atgggcaaac atcaccaaaa aagggcaaga ttgtgcaatc tctcagtgtt 1500 
ccatacggtg gagcacgcac taagcagcca actcatctca aactccatat tccaaagcca 1560 
ttgtctgaaa tgttgcagag cgatctcatt cctccaaatg ctggctgcag ctctaaatac 1620 
aaaaatgaca tgttaaacaa atcaaatcag ggggataacc tgatttcaga atcactgcgt 1680 
gaggattctg cagtgcgcag ccctgttact gatgctaatg gtgatttccc tggagaatac 1740 
aggggacctg gcagcttgtg ccttgcagtg tctccagaca gcccagacaa cgatctgctt 1800 
gccgggcagt ttggggtacc catctctaag ccatttacta ctctagggga agtggctcca 1860 
gtctgggtgc cagattccca agcaccaaac tgcatgaagt gcgaggccag atttacattt 1920 
accaaaagga ggcatcactg ccgagcttgt ggaaaggtgt tctgtgctgc ttgttgcagt 1980 
ctaaaatgca aactacagta catggataaa aaggaggctc gtgtgtgtgt tatttgtcat 2040 
tctgtgctta tgaatgctca agcatgggag aacatgttaa gtgcatcggt ccaaagccca 2100 
aatccaaata atcctgctga atactgctca actatccctc cgatgcagca ggcacaagct 2160 
tcaggagcac tgagttcccc acctcccact gtcatggtgc cagtgggtgt gttaaaacat 2220 
ccaggaactg aagggtcaca gtcaaaggaa cagcgccgtg tttggtttgc tgatggaata 2280 
ttacccaacg gagagactgc tgactcagat aatgcaaacg taactacagt ggctgggaca 2340 
cttactgtgt cacataccaa caattccaca tcttcagagt ctgagaacac ctctggattc 2400 
tgtggaagta taactcaggt tggcagtgca atgaacctta ttccagaaga tgggcttcct 2460 
cctatactaa tctctactgg agtaaaagga gattacgcag ttgaggaacg cccttcccag 2520 
atgtctgtga tgcagcaact agaggaagga ggaccagatc ctttggtttt tgttctaaat 2580 
gcaaatcttt tggccatggt taagatcgtg aactatgtta acaggaaatg ctggtgcttt 2640 
actacaaagg gaatgcatgc agtgggccag gctgagatcg taatcctgtt gcagtgcctg 2700 
cctgatgaga agtgcctgcc gagggacctg tttagccatt ttgttgagct gtatcaggag 2760 
gcaattgcag gtaatgtagt ggggaacctg gggcattcct tcctcagcca gagtttcctg 2820 
ggtagtaagg atcatggtgg atttctttat gttgcaccaa cctaccagtc cctccaggac 2880 
ctggttcttc ctgcagagcc gtacttgttt ggaatcctta ttcaaaagtg ggagactcca 2940 
tgggccaaag tgttccccat tcggcttatg ctgcgtttag gtgcagaata cagattgtac 3000 
ccatgtccac tcttcagtgt tcgatacaga aaacctctgt ttggggaaac cggacacacc 3060 
atcattaatg ttctagccga tttcagaaac tatcagtata ctctgccagt ggtgcagggc 3120 
ttggtggtgg atatggaagt cagaaaaact agcattaaaa tccccagcaa tagatacaat 3180 
gagatgatga aagcaatgaa caaatccaat gagcatgtgt tggccatagg agcatgcttc 3240 
aaccagatgg cagactctca ccttgtgtgt gtgcaaaacg atgatggcaa ttaccagacc 3300 
caggcaatta gtatccacaa acaaccacgt aaagtgaccg gggccagctt ctttgtcttc 3360 
agtggtgcac taaagtcttc ttccggatac ctggccaaat ccagcatagt agaagatggg 3420 
gtaatggttc agatcaccgc agagagcatg gatgccctca gacagtccct tcgggagatg 3480 
aaggatttca ccattacatg tggaaaagct gatgcagagg agtcacagga acatgtctat 3540 
gtccagtggg tggaggatga caagaacttt aacaaaggag tttttagtcc aatcgatggc 3600 
aaatcaatgg agtctgtgac cagcgtcaag atttttcatg gctcagaata caaagctagt 3660 
ggaaaaataa ttcgctggat agaggtcttc tttctggaca atgaggagca acagagtggc 3720 
ctgagtgacc ctgctgatca cagccgactc actgaaaatg tggccaaagc attctgttta 3780 
gcgctttgcc cacacctcaa gctactgaag gaagatggaa tgaccaggtt aggtctgcgg 3840 
gtgtcactgg actcagacca ggttggatac caagctggga gcaatgggca actcctgcct 3900 
gcccgataca ccaatgattt ggatggtgct ttggtaccag tgatacacgg gggcacatgc 3960 
cagttaagtg aagggcctgt cagtatggag ctgatatttt atatccttga gaacatctcc 4020 
taggaaagac acatgtgtct cctcacaaac tgccatcgcc caaaccattt gcactttaac 4080 
cgcaaaagat tcatttttct tttcttttgc taacactagt attaggtcag ggtgcgagag 4140 
gcagacacct gaactcttaa accttctatg cattttcaca gtaaggatca agctgcagct 4200 
gggaatttcc tgttactaat ccaatgtggg acgttagaag tgatcggtgg cactgactat 4260 
ctagctgttc aaccttctct ggctcctcta aggactctag tgccaggggg tgagacattc 4320 
aagtttaaaa cgaaaactct aaatacaatc aggaatctca ctctgacctc atttaaatca 4380 
tcactgcgac tttttttcct gctcgcattc tttattttgc atcttactca agtttacatt 4440 
gtcaagacca gcctaagcct tcagtccttt ctcaattaaa ctactcgtgc atggcaagga 4500 
gactttcgtt gcacagcctg aaatatacca atcacttccc aaaccacaag catgaatcca 4560 
acgttttcct gactggttgg ctctgctgtg aaagggacag caatattatt tttctacagt 4620 
tgacaaaact tttgtctatg tctgtgtctc tcatggggga tttgttgcct gatgggcagc 4680 
ctccggagag aagaattcca cccgtgtgta atatacagtc taagtgtatg gtctgctatg 4740 
taacacctgt tgcgcagtgc aaatgcactg actctctgga aggctataga gttttaaaaa 4800 
cggttagtct tttaaaaaaa aaa 4823 



<210> 6 
<211> 1235 
<212> PRT 
<213> Xenopus 

<400> 6 

Met Pro Lys Met Val Ile Gly Asp Thr Ile Met Ala Glu Asp Ser Leu 
1 5 10 15 

Phe Asn Asn Thr Gly Pro Ser Glu Ile Val Cys Asn Pro Ser Thr Val 
20 25 30 

Glu Ser Gln Ser Leu Gln Ala Leu Asp Asp Gln Ser Val Asn Ile His 
35 40 45 

Asn Glu Lys Ser Val Leu Leu Ala Asp Gly Phe Ser Pro Cys Ser Ser 
50 55 60 

Pro Lys Ser Ile Ile Asn Phe Asp Cys Leu Thr Met Asp Asn Glu Met 
65 70 75 80 

Pro Leu His Ser Gln Met Ser Val Asp Asp Asn Asp Lys Glu Thr Val 
85 90 95 

Thr Ile Ser Val Leu Pro Thr Ile Ile Gln Asp Thr Ser Asn Val Ser 
100 105 110 

Thr Asp Pro Ala Ile Asn Lys Pro Gly Thr Lys Glu Pro His Arg Ala 
115 120 125 

Leu Lys Glu Thr Thr Ser Val Ile Leu Pro Glu Ile Lys Pro Tyr Ser 
130 135 140 

Thr Cys Ala Ala Leu Ser Phe Glu Asn Asn Asn Lys Val Pro Ser Tyr 
145 150 155 160 

Gln Leu Asn Asn Thr Asp Leu Leu Ser Val Ser Pro Val Val Glu Ala 
165 170 175 

Cys Ser Glu Gln Gln Gln Lys His Thr Ser Ser Leu His Glu Glu Lys 
180 185 190 

Leu Phe Glu Gly Val Ser Ala Thr Glu Ser Phe Ala Ala Thr Ala Ala 
195 200 205 

Glu Thr Val Leu Asp Asn Glu Ala Leu Arg Ser Ala Glu Phe Phe Asp 
210 215 220 

Ile Val Val Lys Asn Phe Ser Asp Ser Cys Val Ile Asn Gly Asp Leu 
225 230 235 240 

Thr Lys Ser Cys Gly Leu Ser Gln Glu Ser Asn Glu Lys Phe Cys Ala 
245 250 255 

Ser Lys Glu Phe Glu Gly Gly Val Asp Ala Asn Val Leu Leu Glu Asn 
260 265 270 

Ala Cys Val Ala Tyr Lys Glu Ala Ile Asp Leu Pro Glu Glu Asn Gly 
275 280 285 

Thr Asn Ala Pro Met Ser Leu Tyr Asn Gly Cys Asp Ser Tyr Gly Met 
290 295 300 

Lys Asn Pro Ala Val Ala Gln Asn Pro Lys Asn Leu Pro Ser Lys Glu 
305 310 315 320 

Asp Ser Val Thr Glu Glu Lys Glu Ile Glu Glu Ser Lys Ser Glu Tyr 
325 330 335 

Tyr Thr Gly Val Tyr Glu Gln Gln Arg Glu Asp Asp Val Thr Glu Arg 
340 345 350 

Gly Gly Leu Leu Leu Asn Ala Lys Ala Asp Gln Met Lys Asn Asn Leu 
355 360 365 

His Ser Leu Cys Asn Gln Val Pro Ser Met His Gly Gln Thr Ser Pro 
370 375 380 

Lys Lys Gly Lys Ile Val Gln Ser Leu Ser Val Pro Tyr Gly Gly Ala 
385 390 395 400 

Arg Thr Lys Gln Pro Thr His Leu Lys Leu His Ile Pro Lys Pro Leu 
405 410 415 

Ser Glu Met Leu Gln Ser Asp Leu Ile Pro Pro Asn Ala Gly Cys Ser 
420 425 430 

Ser Lys Tyr Lys Asn Asp Met Leu Asn Lys Ser Asn Gln Gly Asp Asn 
435 440 445 

Leu Ile Ser Glu Ser Leu Arg Glu Asp Ser Ala Val Arg Ser Pro Val 
450 455 460 

Thr Asp Ala Asn Gly Asp Phe Pro Gly Glu Tyr Arg Gly Pro Gly Ser 
465 470 475 480 

Leu Cys Leu Ala Val Ser Pro Asp Ser Pro Asp Asn Asp Leu Leu Ala 
485 490 495 

Gly Gln Phe Gly Val Pro Ile Ser Lys Pro Phe Thr Thr Leu Gly Glu 
500 505 510 

Val Ala Pro Val Trp Val Pro Asp Ser Gln Ala Pro Asn Cys Met Lys 
515 520 525 

Cys Glu Ala Arg Phe Thr Phe Thr Lys Arg Arg His His Cys Arg Ala 
530 535 540 

Cys Gly Lys Val Phe Cys Ala Ala Cys Cys Ser Leu Lys Cys Lys Leu 
545 550 555 560 

Gln Tyr Met Asp Lys Lys Glu Ala Arg Val Cys Val Ile Cys His Ser 
565 570 575 

Val Leu Met Asn Ala Gln Ala Trp Glu Asn Met Leu Ser Ala Ser Val 
580 585 590 

Gln Ser Pro Asn Pro Asn Asn Pro Ala Glu Tyr Cys Ser Thr Ile Pro 
595 600 605 

Pro Met Gln Gln Ala Gln Ala Ser Gly Ala Leu Ser Ser Pro Pro Pro 
610 615 620 

Thr Val Met Val Pro Val Gly Val Leu Lys His Pro Gly Thr Glu Gly 
625 630 635 640 

Ser Gln Ser Lys Glu Gln Arg Arg Val Trp Phe Ala Asp Gly Ile Leu 
645 650 655 

Pro Asn Gly Glu Thr Ala Asp Ser Asp Asn Ala Asn Val Thr Thr Val 
660 665 670 

Ala Gly Thr Leu Thr Val Ser His Thr Asn Asn Ser Thr Ser Ser Glu 
675 680 685 

Ser Glu Asn Thr Ser Gly Phe Cys Gly Ser Ile Thr Gln Val Gly Ser 
690 695 700 

Ala Met Asn Leu Ile Pro Glu Asp Gly Leu Pro Pro Ile Leu Ile Ser 
705 710 715 720 

Thr Gly Val Lys Gly Asp Tyr Ala Val Glu Glu Arg Pro Ser Gln Met 
725 730 735 

Ser Val Met Gln Gln Leu Glu Glu Gly Gly Pro Asp Pro Leu Val Phe 
740 745 750 

Val Leu Asn Ala Asn Leu Leu Ala Met Val Lys Ile Val Asn Tyr Val 
755 760 765 

Asn Arg Lys Cys Trp Cys Phe Thr Thr Lys Gly Met His Ala Val Gly 
770 775 780 

Gln Ala Glu Ile Val Ile Leu Leu Gln Cys Leu Pro Asp Glu Lys Cys 
785 790 795 800 

Leu Pro Arg Asp Leu Phe Ser His Phe Val Glu Leu Tyr Gln Glu Ala 
805 810 815 

Ile Ala Gly Asn Val Val Gly Asn Leu Gly His Ser Phe Leu Ser Gln 
820 825 830 

Ser Phe Leu Gly Ser Lys Asp His Gly Gly Phe Leu Tyr Val Ala Pro 
835 840 845 

Thr Tyr Gln Ser Leu Gln Asp Leu Val Leu Pro Ala Glu Pro Tyr Leu 
850 855 860 

Phe Gly Ile Leu Ile Gln Lys Trp Glu Thr Pro Trp Ala Lys Val Phe 
865 870 875 880 



Pro Ile Arg Leu Met Leu Arg Leu Gly Ala Glu Tyr Arg Leu Tyr Pro 
885 890 895 

Cys Pro Leu Phe Ser Val Arg Tyr Arg Lys Pro Leu Phe Gly Glu Thr 
900 905 910 

Gly His Thr Ile Ile Asn Val Leu Ala Asp Phe Arg Asn Tyr Gln Tyr 
915 920 925 

Thr Leu Pro Val Val Gln Gly Leu Val Val Asp Met Glu Val Arg Lys 
930 935 940 

Thr Ser Ile Lys Ile Pro Ser Asn Arg Tyr Asn Glu Met Met Lys Ala 
945 950 955 960 

Met Asn Lys Ser Asn Glu His Val Leu Ala Ile Gly Ala Cys Phe Asn 
965 970 975 

Gln Met Ala Asp Ser His Leu Val Cys Val Gln Asn Asp Asp Gly Asn 
980 985 990 

Tyr Gln Thr Gln Ala Ile Ser Ile His Lys Gln Pro Arg Lys Val Thr 
995 1000 1005 

Gly Ala Ser Phe Phe Val Phe Ser Gly Ala Leu Lys Ser Ser Ser Gly 
1010 1015 1020 

Tyr Leu Ala Lys Ser Ser Ile Val Glu Asp Gly Val Met Val Gln Ile 
1025 1030 1035 1040 

Thr Ala Glu Ser Met Asp Ala Leu Arg Gln Ser Leu Arg Glu Met Lys 
1045 1050 1055 

Asp Phe Thr Ile Thr Cys Gly Lys Ala Asp Ala Glu Glu Ser Gln Glu 
1060 1065 1070 

His Val Tyr Val Gln Trp Val Glu Asp Asp Lys Asn Phe Asn Lys Gly 
1075 1080 1085 

Val Phe Ser Pro Ile Asp Gly Lys Ser Met Glu Ser Val Thr Ser Val 
1090 1095 1100 

Lys Ile Phe His Gly Ser Glu Tyr Lys Ala Ser Gly Lys Ile Ile Arg 
1105 1110 1115 1120 

Trp Ile Glu Val Phe Phe Leu Asp Asn Glu Glu Gln Gln Ser Gly Leu 
1125 1130 1135 

Ser Asp Pro Ala Asp His Ser Arg Leu Thr Glu Asn Val Ala Lys Ala 
1140 1145 1150 

Phe Cys Leu Ala Leu Cys Pro His Leu Lys Leu Leu Lys Glu Asp Gly 
1155 1160 1165 

Met Thr Arg Leu Gly Leu Arg Val Ser Leu Asp Ser Asp Gln Val Gly 
1170 1175 1180 

Tyr Gln Ala Gly Ser Asn Gly Gln Leu Leu Pro Ala Arg Tyr Thr Asn 
1185 1190 1195 1200 

Asp Leu Asp Gly Ala Leu Val Pro Val Ile His Gly Gly Thr Cys Gln 
1205 1210 1215 

Leu Ser Glu Gly Pro Val Ser Met Glu Leu Ile Phe Tyr Ile Leu Glu 
1220 1225 1230 

Asn Ile Ser 
1235 



<210> 7 
<211> 2678 
<212> DNA 
<213> Xenopus 



<400> 7 

agttttattt tcagaagacg ttgcatcttt attttaaaca ttaagtttca ctatgtagta 60 
aaacattact gttgtatata cagtatgttg tagacatata acgtaactgt ttgctttgtg 120 
ctttctttcc tcctcagatg aaactgtctt tccaaagctg ttagatgcta agtggaatca 180 
attcttagaa ccacattcgc ataaagtcac tgataaacca gctcttgaca atgtctgtaa 240 
atcaatcatt gctattgaag ctcatctcaa agtcaggtca cccagcttga cagcccttgc 300 
aaggtccaca tatgtgaatg gagaagtagg tattgtgact cctgaaatgc ctaaaatggt 360 
gataggagac accgatatgg cagaggattc actttttaac actggtccct ctgaaattgt 420 
atgcaactct attgtggaga gtcaaagttt agaagtttta gatgatgtac cagtgagtat 480 
taacaatgaa aaaagtgttc ttcttgatga tggattttct ccgtacagta gccccaaaag 540 
tgttctaaac tctgcttgct tgaccatgaa taacggaaag ccctcacacg gtcaaaaaat 600 
tgttaatgac caagataaag aagctgtaac aatttcagtc cttccaatga tcatacagga 660 
tactactaac gtaagcacag acccagcttt caataaatct ggcactgaag aagcttatag 720 
tgcattaaaa caaaccacat cagttattct gcctgaaata aagccttatt ccatacaggc 780 
tgccctttca tgtgaaaata tcaacaagat acccagatgt caattaaata atacagatct 840 
actcagcatt tcaccagtgg ttgaagcatg tagtgagaag cagcaaaatc atacaacttc 900 
cttgcatgaa aaaaaacttg cagctgtgtc tgcaactgcg ttctttccag tcactgctgc 960 
tgaaactgta ctaggtaatg aagctctcca tagtgctgat ttttttgaca ttgttgtaaa 1020 
gaacgtttct gactcgtgtg tgtttaatgg tgacctaact agaactaatg gactctcaca 1080 
agaaaacaat gaaatgtttt atgcaagtaa agagttggaa ggaggggtag atgctaatat 1140 
cttattggaa gatgcatgca tagcttataa agaaagaata gatttgtctg aagaaaatgg 1200 
aactaatgca ccaatgtatc tgtacaatgg gtgtgattcc tatggaatga aaaaccctgc 1260 
tgtacgtcaa aacccaaaga atttaccatc aaaagaagat tctgtgacag aagaaaaaga 1320 
aattgaagaa agcaagtcag aatactattc tggtgtttat gaacaacaga aggaagatga 1380 
cataactgag agaggtggag tcttgttaaa tgccaaggtt gaccaaatga agaacagttt 1440 
gcatagtctt tataatccgg ttccatccat gcatgggcaa acctcaccaa aaaagggcaa 1500 
gattgtgcaa tccctcagtg ttccatatgg tggagctcgc cccaagcagc caactcatct 1560 
caaactcaat attccacagc cattgtctga aatgttacag tgtgatctca ttccgccaaa 1620 
tgctggatgc agctctaaaa acaaaaatga catgttaaac aaatcaaatc ggggggataa 1680 
cctgatttca gaatcactac gtgaggaagt gcacagccct gttactgata caaatggtga 1740 
agtccctcga gaaaacaggg gacctggcag cctgtgcctt gcagtgtctc cagacagccc 1800 
tgacaatgat ctgcttgctg gacagtttgg ggtacccatc tctaagccat ttactactct 1860 
aggggatgtg gctccagtct gggtgccaga ttcccaagca ccaaactgca tgaagtgcga 1920 
ggccagattt acatttacca aaaggaggca tcactgccga gcttgtggaa aggtatgtaa 1980 
agaaatgtgg tgtttcatca gggcaacagt aatcacggca aattattcat aacaaaatgt 2040 
gttcagcaga ttcagttaaa gtagacttat aagttacaca gtaacaattc atctgctcag 2100 
cctcattttg aagtagataa aatatatttt attaggaaac tctggggaga tataagggaa 2160 
agcttgccta aaagtagatg ttctgtatat tatttggtag tcaaagatga tttcatgaaa 2220 
aaaggttatt tgtaaaaagt acaaaatggg tagagactag acaataaaaa gtaaggagta 2280 
aaaaactagg tatgtaacgt atattaaaat aattttatga ttttaatatt tactgcacat 2340 
tttctacagt gcagtgattt gtataaccat gcaattatca aatgcttagt gccttcacac 2400 
aaagtgcctt taataaaaat tattttataa attatcatat tttctttata tgtagtcatc 2460 
atcttttttg tctcatttct tggaatcgtt ctacttatgt tctactgata tgttttttac 2520 
ccgagaccta tcttgtcctc taaagtaatt ggcttgtcaa ctggctgtag ggggattttc 2580 
agagttatag cttagtactg ttaatgagcc ataggttgaa atagtgctct agatttacat 2640 
gttgtacaac agttattgca atatgtgtag gggggggg 2678 



<210> 8 
<211> 561 
<212> PRT 
<213> Xenopus 

<400> 8 

Met Pro Lys Met Val Ile Gly Asp Thr Asp Met Ala Glu Asp Ser Leu 
1 5 10 15 

Phe Asn Thr Gly Pro Ser Glu Ile Val Cys Asn Ser Ile Val Glu Ser 
20 25 30 

Gln Ser Leu Glu Val Leu Asp Asp Val Pro Val Ser Ile Asn Asn Glu 
35 40 45 

Lys Ser Val Leu Leu Asp Asp Gly Phe Ser Pro Tyr Ser Ser Pro Lys 
50 55 60 

Ser Val Leu Asn Ser Ala Cys Leu Thr Met Asn Asn Gly Lys Pro Ser 
65 70 75 80 

His Gly Gln Lys Ile Val Asn Asp Gln Asp Lys Glu Ala Val Thr Ile 
85 90 95 

Ser Val Leu Pro Met Ile Ile Gln Asp Thr Thr Asn Val Ser Thr Asp 
100 105 110 

Pro Ala Phe Asn Lys Ser Gly Thr Glu Glu Ala Tyr Ser Ala Leu Lys 
115 120 125 

Gln Thr Thr Ser Val Ile Leu Pro Glu Ile Lys Pro Tyr Ser Ile Gln 
130 135 140 

Ala Ala Leu Ser Cys Glu Asn Ile Asn Lys Ile Pro Arg Cys Gln Leu 
145 150 155 160 

Asn Asn Thr Asp Leu Leu Ser Ile Ser Pro Val Val Glu Ala Cys Ser 
165 170 175 

Glu Lys Gln Gln Asn His Thr Thr Ser Leu His Glu Lys Lys Leu Ala 
180 185 190 

Ala Val Ser Ala Thr Ala Phe Phe Pro Val Thr Ala Ala Glu Thr Val 
195 200 205 

Leu Gly Asn Glu Ala Leu His Ser Ala Asp Phe Phe Asp Ile Val Val 
210 215 220 

Lys Asn Val Ser Asp Ser Cys Val Phe Asn Gly Asp Leu Thr Arg Thr 
225 230 235 240 

Asn Gly Leu Ser Gln Glu Asn Asn Glu Met Phe Tyr Ala Ser Lys Glu 
245 250 255 

Leu Glu Gly Gly Val Asp Ala Asn Ile Leu Leu Glu Asp Ala Cys Ile 
260 265 270 

Ala Tyr Lys Glu Arg Ile Asp Leu Ser Glu Glu Asn Gly Thr Asn Ala 
275 280 285 

Pro Met Tyr Leu Tyr Asn Gly Cys Asp Ser Tyr Gly Met Lys Asn Pro 
290 295 300 

Ala Val Arg Gln Asn Pro Lys Asn Leu Pro Ser Lys Glu Asp Ser Val 
305 310 315 320 

Thr Glu Glu Lys Glu Ile Glu Glu Ser Lys Ser Glu Tyr Tyr Ser Gly 
325 330 335 

Val Tyr Glu Gln Gln Lys Glu Asp Asp Ile Thr Glu Arg Gly Gly Val 
340 345 350 

Leu Leu Asn Ala Lys Val Asp Gln Met Lys Asn Ser Leu His Ser Leu 
355 360 365 

Tyr Asn Pro Val Pro Ser Met His Gly Gln Thr Ser Pro Lys Lys Gly 
370 375 380 

Lys Ile Val Gln Ser Leu Ser Val Pro Tyr Gly Gly Ala Arg Pro Lys 
385 390 395 400 

Gln Pro Thr His Leu Lys Leu Asn Ile Pro Gln Pro Leu Ser Glu Met 
405 410 415 

Leu Gln Cys Asp Leu Ile Pro Pro Asn Ala Gly Cys Ser Ser Lys Asn 
420 425 430 

Lys Asn Asp Met Leu Asn Lys Ser Asn Arg Gly Asp Asn Leu Ile Ser 
435 440 445 

Glu Ser Leu Arg Glu Glu Val His Ser Pro Val Thr Asp Thr Asn Gly 
450 455 460 

Glu Val Pro Arg Glu Asn Arg Gly Pro Gly Ser Leu Cys Leu Ala Val 
465 470 475 480 

Ser Pro Asp Ser Pro Asp Asn Asp Leu Leu Ala Gly Gln Phe Gly Val 
485 490 495 

Pro Ile Ser Lys Pro Phe Thr Thr Leu Gly Asp Val Ala Pro Val Trp 
500 505 510 

Val Pro Asp Ser Gln Ala Pro Asn Cys Met Lys Cys Glu Ala Arg Phe 
515 520 525 

Thr Phe Thr Lys Arg Arg His His Cys Arg Ala Cys Gly Lys Val Cys 
530 535 540 

Lys Glu Met Trp Cys Phe Ile Arg Ala Thr Val Ile Thr Ala Asn Tyr 
545 550 555 560 

Ser 